Skip to content

devenv: async-ify the core state#1970

Merged
domenkozar merged 15 commits into
mainfrom
asyncify
Jul 7, 2025
Merged

devenv: async-ify the core state#1970
domenkozar merged 15 commits into
mainfrom
asyncify

Conversation

@sandydoo

@sandydoo sandydoo commented Jul 1, 2025

Copy link
Copy Markdown
Member

We have a few places in the codebase that are, as they say, embarrassingly parallel. But any effort to improve on that is hindered by the use of mut self, both in the devenv struct and in the Nix backend.

This PR removes those limitations allowing both devenv and nix commands to be called concurrently.

@sandydoo sandydoo added enhancement New feature or request cli Related to the devenv CLI labels Jul 1, 2025
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jul 1, 2025

Copy link
Copy Markdown

Deploying devenv with  Cloudflare Pages  Cloudflare Pages

Latest commit: 68781ea
Status: ✅  Deploy successful!
Preview URL: https://7f2b51d9.devenv.pages.dev
Branch Preview URL: https://asyncify.devenv.pages.dev

View logs

@cachix cachix deleted a comment from claude Bot Jul 3, 2025
@cachix cachix deleted a comment from claude Bot Jul 3, 2025
@cachix cachix deleted a comment from claude Bot Jul 3, 2025
@cachix cachix deleted a comment from claude Bot Jul 3, 2025
@sandydoo sandydoo marked this pull request as ready for review July 5, 2025 22:10
@sandydoo

sandydoo commented Jul 7, 2025

Copy link
Copy Markdown
Member Author

Random task_cache test failure:

> failures:
       >
       > ---- task_cache::tests::test_file_modification_detection stdout ----
       >
       > thread 'task_cache::tests::test_file_modification_detection' panicked at devenv-tasks/src/task_cache.rs:317:9:
       > assertion failed: !cache.check_modified_files(task_name, &[path_str.clone()]).await.unwrap()
       > note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
       >
       >
       > failures:
       >     task_cache::tests::test_file_modification_detection
       >
       > test result: FAILED(B. 25 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 7.91s
       >
       > error: test failed, to rerun pass `-p devenv-tasks --lib`

@sandydoo

sandydoo commented Jul 7, 2025

Copy link
Copy Markdown
Member Author

Random task_cache test failure:

> failures:
       >
       > ---- task_cache::tests::test_file_modification_detection stdout ----
       >
       > thread 'task_cache::tests::test_file_modification_detection' panicked at devenv-tasks/src/task_cache.rs:317:9:
       > assertion failed: !cache.check_modified_files(task_name, &[path_str.clone()]).await.unwrap()
       > note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
       >
       >
       > failures:
       >     task_cache::tests::test_file_modification_detection
       >
       > test result: FAILED(B. 25 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 7.91s
       >
       > error: test failed, to rerun pass `-p devenv-tasks --lib`

#1985

@cachix cachix deleted a comment from claude Bot Jul 7, 2025
@claude

claude Bot commented Jul 7, 2025

Copy link
Copy Markdown
Contributor

Claude finished @sandydoo's task —— View job


PR Review: devenv async-ify the core state

Todo List:

  • Read main changed files to understand the async refactoring
  • Analyze the failing test issue
  • Review concurrency and error handling patterns
  • Check for potential race conditions or deadlocks
  • Evaluate performance implications
  • Provide comprehensive review feedback

Analysis Summary

This PR successfully refactors the core devenv state to enable concurrent execution by:

  • Converting the Devenv struct to use Arc<RwLock<Config>> for shared state
  • Making all backend methods async via the #[async_trait(?Send)] pattern
  • Removing mut self requirements throughout the codebase
  • Adding proper concurrency primitives like Semaphore for assemble operations

Critical Issues

1. Race Condition in File Modification Detection ⚠️

File: devenv-tasks/src/task_cache.rs:317

The failing test task_cache::tests::test_file_modification_detection indicates a race condition. The test expects check_modified_files to return false (not modified) but it's returning true (modified).

Root Cause: In task_cache.rs:243-246, the file state is updated after detecting a change:

self.update_file_state_with_file(task_name, &current_file).await?;
return Ok(true);

Issue: Multiple concurrent calls to check_modified_files for the same file can race, causing:

  1. Thread A reads file state, detects change
  2. Thread B reads same file state, detects same change
  3. Both threads return true even though only one should

Solution: Add file-level locking or use database transactions to ensure atomic read-modify-write operations.

2. Potential Deadlock in Config Access ⚠️

File: devenv/src/devenv.rs:246, devenv/src/devenv.rs:362

let mut config = self.config.write().await;  // Line 246
// ...
let config_clean = self.config.read().await.clean.clone().unwrap_or_default(); // Line 362

Issue: Multiple write locks or nested read/write locks could cause deadlocks if not carefully ordered.

Recommendation: Minimize lock scope and ensure consistent lock ordering. Consider using try_write() with timeouts for non-critical operations.

3. Inconsistent Error Handling in Concurrent Operations

File: devenv/src/nix.rs:703-715

The concurrent HTTP requests in get_cachix_caches() use future::join_all() but continue processing even when some requests fail:

for result in results {
    match result {
        Ok((name, key)) => { new_known_keys.insert(name, key); }
        Err(e) => { error!("{}", e); } // Just logs, continues
    }
}

Issue: Silent failures could lead to incomplete cache configuration.

Recommendation: Consider failing fast for critical cache operations or implementing retry logic.

Performance Considerations

Positive Improvements ✅

  1. Concurrent Nix Operations: Multiple nix commands can now run in parallel
  2. Concurrent HTTP Requests: Cachix cache fetching is now concurrent (lines 703-715 in nix.rs)
  3. Concurrent Search: Package and options search run concurrently (devenv.rs:595-596)

Potential Performance Regressions ⚠️

  1. Lock Contention: Frequent RwLock access on config could become a bottleneck
  2. Memory Overhead: Arc wrappers add indirection and reference counting overhead
  3. Async Overhead: Every method call now has async/await overhead

Code Quality Issues

1. Missing Documentation on Concurrency Safety

The Devenv struct and key methods lack documentation about their concurrency guarantees.

Recommendation: Add doc comments explaining:

  • Which operations are safe to call concurrently
  • Lock ordering requirements
  • Expected behavior under concurrent access

2. Inconsistent Async Patterns

Some methods use ?Send trait bounds while others don't need them.

File: devenv/src/nix_backend.rs:68

#[async_trait(?Send)]
pub trait NixBackend: Send + Sync {

Issue: The ?Send bound suggests the trait implementations might not be Send, but the trait requires Send + Sync.

3. Error Handling in Concurrent Contexts

Some error paths don't properly handle concurrent state modifications.

Example: In task_cache.rs:209, if update_file_state fails after is_file_modified returns true, the cache could be in an inconsistent state.

Security Considerations

1. Race Conditions in File System Operations

The file modification detection logic could be exploited in scenarios where files are modified concurrently with cache checks.

2. Shared State Access

The RwLock<Config> allows concurrent reads but requires careful handling of sensitive configuration data.

Recommendations

High Priority 🔴

  1. Fix the race condition in task_cache.rs by implementing proper file-level locking
  2. Add integration tests for concurrent operations to catch race conditions
  3. Review lock ordering throughout the codebase to prevent deadlocks

Medium Priority 🟡

  1. Add timeout handling for async operations to prevent hanging
  2. Implement circuit breaker pattern for external HTTP requests in cachix operations
  3. Add metrics/logging for lock contention to identify performance bottlenecks

Low Priority 🟢

  1. Document concurrency guarantees in public APIs
  2. Consider using parking_lot::RwLock for better performance
  3. Add benchmarks comparing sync vs async performance

Test Coverage Gap

The failing test reveals insufficient coverage for concurrent scenarios. Consider adding:

  • Concurrent file modification tests
  • Stress tests with multiple parallel operations
  • Property-based tests for race conditions

Overall Assessment

This is a solid foundation for enabling concurrency in devenv, but the race condition in file modification detection needs immediate attention before merge. The async refactoring is well-executed overall, but production use should wait until the concurrency issues are resolved.

Recommendation: Address the race condition and add integration tests before merging.


@domenkozar domenkozar merged commit 593e03f into main Jul 7, 2025
8 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cli Related to the devenv CLI enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants