-
-
Notifications
You must be signed in to change notification settings - Fork 56
fix: retry file copy on failure due to race #359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
hmnd
wants to merge
1
commit into
MordechaiHadad:master
Choose a base branch
from
hmnd:push-zzsnvsuuzumt
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,7 +4,9 @@ use reqwest::Client; | |
| use std::env; | ||
| use std::path::{Path, PathBuf}; | ||
| use std::process::Command; | ||
| use std::time::Duration; | ||
| use tokio::fs::{self}; | ||
| use tokio::time::sleep; | ||
| use tracing::info; | ||
|
|
||
| use crate::config::{Config, ConfigFile}; | ||
|
|
@@ -229,6 +231,9 @@ async fn copy_nvim_proxy(config: &ConfigFile) -> Result<()> { | |
| Ok(()) | ||
| } | ||
|
|
||
| const FILE_COPY_MAX_RETRIES: u32 = 5; | ||
| const FILE_COPY_INITIAL_DELAY_MS: u64 = 50; | ||
|
|
||
| /// Asynchronously copies a file from `old_path` to `new_path`, handling specific OS errors. | ||
| /// | ||
| /// This function attempts to copy a file from the specified `old_path` to the specified `new_path`. | ||
|
|
@@ -269,15 +274,37 @@ async fn copy_nvim_proxy(config: &ConfigFile) -> Result<()> { | |
| /// } | ||
| /// ``` | ||
| async fn copy_file_with_error_handling(old_path: &Path, new_path: &Path) -> Result<()> { | ||
| match fs::copy(&old_path, &new_path).await { | ||
| Ok(_) => Ok(()), | ||
| Err(e) => match e.raw_os_error() { | ||
| Some(26) | Some(32) => Err(anyhow::anyhow!( | ||
| "The file {} is busy. Please make sure to close any processes using it.", | ||
| old_path.display() | ||
| )), | ||
| _ => Err(anyhow::anyhow!(e).context("Failed to copy file")), | ||
| }, | ||
| let mut delay = Duration::from_millis(FILE_COPY_INITIAL_DELAY_MS); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a reason we're mixing synchronous duration calls with asynchronous code execution here? I think it would be better to keep them under the same runtime system. |
||
| let mut attempts = 0; | ||
|
|
||
| loop { | ||
| match fs::copy(&old_path, &new_path).await { | ||
| Ok(_) => return Ok(()), | ||
| Err(e) => { | ||
| let is_file_busy = matches!(e.raw_os_error(), Some(26) | Some(32)); | ||
|
|
||
| if is_file_busy && attempts < FILE_COPY_MAX_RETRIES { | ||
| attempts += 1; | ||
| tracing::debug!( | ||
| "File {} is busy, retrying in {:?} (attempt {}/{})", | ||
| new_path.display(), | ||
| delay, | ||
| attempts, | ||
| FILE_COPY_MAX_RETRIES | ||
| ); | ||
| sleep(delay).await; | ||
| delay *= 2; | ||
| } else if is_file_busy { | ||
| return Err(anyhow!( | ||
| "The file {} is busy after {} attempts. Please make sure to close any processes using it.", | ||
| new_path.display(), | ||
| FILE_COPY_MAX_RETRIES | ||
| )); | ||
| } else { | ||
| return Err(anyhow!(e).context("Failed to copy file")); | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
|
|
||
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tested the bug you're getting with the old code, but using tokio::fs (async file operations) to see if it's due to the crossing the runtime boundary?
Perhaps we could do this using a more correct pattern for synchronous FS operations - would also be useful elsewhere in the code-base too.
(Struct/container implementing asref/deref + handling for Drop trait impl. etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, that's probably a better way to fix this. Of course because this bug is tempermental, I can't reproduce it at the moment - even without either fix. I guess because we don't really care abot blocking the main thread in this tool, it would be best to standardize on synchronous FS?
I'm honestly not super well versed in rust, but I kept running into these big issues so wanted to do something about them :p
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way Rust's async runtime works differs quite a bit to other languages (Like Python or JS (Well, Node really). Anything that's blocking will effectively 'halt-the-world' until it's finished, as you're handing over the entire program runtime (which for Rust, is it's runtime and that's it) to tokio. The problem with blocking (especially in a loop like here) is every time we loop and hit a blocking operation, the program effectively 'hiccups' and halts.
I believe that tokio already wraps their struct for file operations to handle handover for their executor, though if you're up for the challenge you're more than welcome to attempt an implementation of a file operation wrapper struct.
Would like be something akin to Arc + probably an atomic to sync on, or a tokio semaphore to cause wait/gather, but you'll also need to poll the future yourself and implement Drop to handle the future completing.
Alternatives are of course handing it off via a task too