The ZOOM-FFT Algorithm and Its Application to Frequency Refinement

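x1_fft = torch.fft.fft2(x1, norm='ortho')
x1_amp = torch.abs(x1_fft)
x1_phase = torch.angle(x1_fft)

Here x1_amp and x1_phase are the amplitude and phase. How should they be processed as described above? Please show me how to implement this in code.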

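We are discussing the exploding-gradient problem in frequency-domain processing, specifically how to avoid exploding gradients when processing the amplitude and phase obtained from an FFT2 transform in PyTorch. According to the references the user provided, the amplitude and phase of frequency-domain features in audio and radar signal processing can make training unstable. ...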
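One reason the phase can destabilize training is that the gradient of the phase with respect to the real and imaginary parts has norm 1/|z|, so it grows without bound for bins whose amplitude is close to zero. The sketch below illustrates this with plain Python arithmetic rather than PyTorch autograd. The function names and the `eps` value are illustrative assumptions, not part of the original question or answer.

```python
import math

def phase_grad_norm(re, im):
    """Norm of the gradient of atan2(im, re) with respect to (re, im).

    d/d(re) = -im / (re^2 + im^2) and d/d(im) = re / (re^2 + im^2),
    so the norm equals 1 / |z|.
    """
    mag_sq = re * re + im * im
    return math.hypot(-im / mag_sq, re / mag_sq)

def stable_amp(re, im, eps=1e-8):
    """Amplitude with eps inside the square root (eps is an assumed value)."""
    return math.sqrt(re * re + im * im + eps)

def stable_amp_grad(re, im, eps=1e-8):
    """Gradient of stable_amp; well defined even at re = im = 0."""
    a = stable_amp(re, im, eps)
    return (re / a, im / a)

if __name__ == "__main__":
    for mag in (1.0, 1e-3, 1e-6):
        print(mag, phase_grad_norm(mag, 0.0))
    print(stable_amp_grad(0.0, 0.0))
```

Under these assumptions, a common mitigation is to keep `eps` inside the square root when computing the amplitude and to avoid losses that depend directly on the phase of near-zero bins, for example by working with the real and imaginary parts instead.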

function timbre_transfer_2
% Create the main window
fig = figure('Name', 'Advanced Timbre Transfer System v3.2', 'Position', [50, 50, 1200, 900], ...
    'NumberTitle', 'off', 'MenuBar', 'none', 'Resize', 'on', ...
    'CloseRequestFcn', @close_gui, 'Color', [0.94, 0.94, 0.94]);

% Global variables
fs = 44100;                  % default sample rate
source_audio = [];           % source audio (provides the timbre)
target_audio = [];           % target audio (provides the content)
converted_audio = [];        % converted audio
processing = false;          % processing-state flag
conversion_complete = false; % conversion-complete flag

% STFT parameters
stft_params.win_len = 2048;    % window length
stft_params.overlap = 1536;    % overlap in samples (75%)
stft_params.nfft = 2048;       % FFT size
stft_params.window = hamming(stft_params.win_len, 'periodic'); % Hamming window
stft_params.lifter_order = 30; % envelope order
stft_params.phase_iter = 5;    % number of phase iterations
stft_params.fs = fs;           % sample rate
stft_params.hop_size = stft_params.win_len - stft_params.overlap; % hop size
% Compute the synthesis window (to ensure perfect reconstruction)
stft_params.win_synthesis = stft_params.window / sum(stft_params.window.^2) * stft_params.hop_size;

% === Create controls ===
% Top control panel
control_panel = uipanel('Title', 'Audio Controls', 'Position', [0.02, 0.92, 0.96, 0.07], ...
    'BackgroundColor', [0.9, 0.95, 1]);
uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', 'Import source audio (timbre)', ...
    'Position', [20, 10, 150, 30], 'Callback', @load_source, ...
    'FontSize', 10, 'FontWeight', 'bold', 'BackgroundColor', [0.7, 0.9, 1]);
uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', 'Import target audio (content)', ...
    'Position', [190, 10, 150, 30], 'Callback', @load_target, ...
    'FontSize', 10, 'FontWeight', 'bold', 'BackgroundColor', [0.7, 0.9, 1]);
uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', 'Run timbre transfer', ...
    'Position', [360, 10, 150, 30], 'Callback', @transfer_timbre, ...
    'FontSize', 10, 'FontWeight', 'bold', 'BackgroundColor', [0.8, 1, 0.8]);
uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', 'Play target audio', ...
    'Position', [530, 10, 120, 30], 'Callback', {@play_target_audio}, ...
    'FontSize', 10, 'BackgroundColor', [1, 0.95, 0.8]);
uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', 'Play converted audio', ...
    'Position', [670, 10, 120, 30], 'Callback', {@play_converted_audio}, ...
    'FontSize', 10, 'BackgroundColor', [1, 0.95, 0.8]);
uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', 'Save converted audio', ...
    'Position', [810, 10, 120, 30], 'Callback', @save_audio, ...
    'FontSize', 10, 'BackgroundColor', [0.9, 1, 0.8]);
uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', 'Stop playback', ...
    'Position', [950, 10, 120, 30], 'Callback', @stop_audio, ...
    'FontSize', 10, 'BackgroundColor', [1, 0.8, 0.8]);

% Parameter panel
param_panel = uipanel('Title', 'STFT Parameters', 'Position', [0.02, 0.82, 0.96, 0.09], ...
    'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold');
uicontrol('Parent', param_panel, 'Style', 'text', 'String', 'Window length:', ...
    'Position', [20, 40, 50, 20], 'HorizontalAlignment', 'left', ...
    'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold');
win_len_edit = uicontrol('Parent', param_panel, 'Style', 'edit', ...
    'String', num2str(stft_params.win_len), ...
    'Position', [80, 40, 80, 25], 'Callback', @update_params, ...
    'BackgroundColor', [1, 1, 1]);
uicontrol('Parent', param_panel, 'Style', 'text', 'String', 'Overlap (%):', ...
    'Position', [180, 40, 70, 20], 'HorizontalAlignment', 'left', ...
    'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold');
overlap_edit = uicontrol('Parent', param_panel, 'Style', 'edit', ...
    'String', '75', ...
    'Position', [260, 40, 80, 25], 'Callback', @update_params, ...
    'BackgroundColor', [1, 1, 1]);
uicontrol('Parent', param_panel, 'Style', 'text', 'String', 'FFT size:', ...
    'Position', [360, 40, 60, 20], 'HorizontalAlignment', 'left', ...
    'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold');
nfft_edit = uicontrol('Parent', param_panel, 'Style', 'edit', ...
    'String', num2str(stft_params.nfft), ...
    'Position', [430, 40, 80, 25], 'Callback', @update_params, ...
    'BackgroundColor', [1, 1, 1]);
uicontrol('Parent', param_panel, 'Style', 'text', 'String', 'Envelope order:', ...
    'Position', [530, 40, 60, 20], 'HorizontalAlignment', 'left', ...
    'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold');
lifter_edit = uicontrol('Parent', param_panel, 'Style', 'edit', ...
    'String', num2str(stft_params.lifter_order), ...
    'Position', [600, 40, 80, 25], 'Callback', @update_params, ...
    'BackgroundColor', [1, 1, 1]);
uicontrol('Parent', param_panel, 'Style', 'text', 'String', 'Phase iterations:', ...
    'Position', [700, 40, 60, 20], 'HorizontalAlignment', 'left', ...
    'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold');
iter_edit = uicontrol('Parent', param_panel, 'Style', 'edit', ...
    'String', num2str(stft_params.phase_iter), ...
    'Position', [770, 40, 80, 25], 'Callback', @update_params, ...
    'BackgroundColor', [1, 1, 1]);

% Waveform display area (tabbed)
tabgp = uitabgroup(fig, 'Position', [0.02, 0.02, 0.96, 0.35]);
tab1 = uitab(tabgp, 'Title', 'Target audio');
tab2 = uitab(tabgp, 'Title', 'Converted audio');
tab3 = uitab(tabgp, 'Title', 'Source audio');
ax1 = axes('Parent', tab1, 'Position', [0.07, 0.15, 0.9, 0.75]);
title(ax1, 'Target audio waveform'); xlabel(ax1, 'Time (s)'); ylabel(ax1, 'Amplitude'); grid(ax1, 'on');
ax2 = axes('Parent', tab2, 'Position', [0.07, 0.15, 0.9, 0.75]);
title(ax2, 'Converted audio waveform'); xlabel(ax2, 'Time (s)'); ylabel(ax2, 'Amplitude'); grid(ax2, 'on');
ax3 = axes('Parent', tab3, 'Position', [0.07, 0.15, 0.9, 0.75]);
title(ax3, 'Source audio waveform'); xlabel(ax3, 'Time (s)'); ylabel(ax3, 'Amplitude'); grid(ax3, 'on');

% Spectrum display area (only three spectrograms are kept)
spec_panel = uipanel('Title', 'Spectrum Analysis', 'Position', [0.02, 0.38, 0.96, 0.43], ...
    'BackgroundColor', [0.98, 0.98, 0.98], 'FontWeight', 'bold');
% Enlarged spectrogram axes (vertical direction)
ax4 = axes('Parent', spec_panel, 'Position', [0.03, 0.1, 0.3, 0.8]);  % height increased to 80%
title(ax4, 'Source audio spectrum');
ax5 = axes('Parent', spec_panel, 'Position', [0.36, 0.1, 0.3, 0.8]);  % height increased to 80%
title(ax5, 'Target audio spectrum');
ax6 = axes('Parent', spec_panel, 'Position', [0.69, 0.1, 0.3, 0.8]);  % height increased to 80%
title(ax6, 'Converted spectrum');

% Status text
status_text = uicontrol('Style', 'text', 'Position', [50, 5, 900, 30], ...
    'String', 'Ready', 'HorizontalAlignment', 'left', ...
    'FontSize', 10, 'FontWeight', 'bold', 'BackgroundColor', [1, 1, 1]);

% Progress bar
progress_ax = axes('Position', [0.1, 0.97, 0.8, 0.02], ...
    'XLim', [0, 1], 'YLim', [0, 1], 'Box', 'on', 'Color', [0.9, 0.9, 0.9]);
progress_bar = patch(progress_ax, [0 0 0 0], [0 0 1 1], [0.2, 0.6, 1]);
axis(progress_ax, 'off');
progress_text = uicontrol('Style', 'text', 'Position', [500, 970, 200, 20], ...
    'String', '', 'HorizontalAlignment', 'center', ...
    'FontSize', 10, 'FontWeight', 'bold', 'BackgroundColor', [1, 1, 1]);

% Diagnostics panel
diag_panel = uipanel('Title', 'Processing Log', 'Position', [0.02, 0.02, 0.96, 0.35], ...
    'BackgroundColor', [0.95, 0.95, 0.95], 'Visible', 'off');
diag_text = uicontrol('Parent', diag_panel, 'Style', 'listbox', ...
    'Position', [10, 10, 1140, 250], 'String', {'System initialized'}, ...
    'HorizontalAlignment', 'left', 'FontSize', 9, ...
    'BackgroundColor', [1, 1, 1], 'Max', 100, 'Min', 0);

% Show/hide log button
uicontrol('Style', 'pushbutton', 'String', 'Show log', ...
    'Position', [1125, 928, 120, 30], 'Callback', @toggle_log, ...
    'FontSize', 9, 'BackgroundColor', [0.9, 0.95, 1]);

% === Callbacks ===
    % Parameter update callback
    function update_params(~, ~)
        try
            % Read the new parameter values
            new_win_len = str2double(get(win_len_edit, 'String'));
            overlap_percent = str2double(get(overlap_edit, 'String'));
            new_nfft = str2double(get(nfft_edit, 'String'));
            lifter_order = str2double(get(lifter_edit, 'String'));
            phase_iter = str2double(get(iter_edit, 'String'));

            % Validate the parameters
            if isnan(new_win_len) || new_win_len <= 0 || mod(new_win_len, 1) ~= 0
                error('Window length must be a positive integer');
            end
            if isnan(overlap_percent) || overlap_percent < 0 || overlap_percent > 100
                error('Overlap must be a number between 0 and 100');
            end
            if isnan(new_nfft) || new_nfft <= 0 || mod(new_nfft, 1) ~= 0
                error('FFT size must be a positive integer');
            end
            if isnan(lifter_order) || lifter_order <= 0 || mod(lifter_order, 1) ~= 0
                error('Envelope order must be a positive integer');
            end
            if isnan(phase_iter) || phase_iter <= 0 || mod(phase_iter, 1) ~= 0
                error('Phase iteration count must be a positive integer');
            end

            % Update the parameters
            stft_params.win_len = new_win_len;
            stft_params.overlap = round(overlap_percent/100 * new_win_len);
            stft_params.nfft = new_nfft;
            stft_params.window = hamming(new_win_len, 'periodic');
            stft_params.lifter_order = lifter_order;
            stft_params.phase_iter = phase_iter;
            stft_params.hop_size = stft_params.win_len - stft_params.overlap;
            stft_params.win_synthesis = stft_params.window / sum(stft_params.window.^2) * stft_params.hop_size;

            update_diag(sprintf('Parameters updated: window=%d, overlap=%d(%.0f%%), FFT=%d', ...
                new_win_len, stft_params.overlap, overlap_percent, new_nfft));
        catch e
            errordlg(['Parameter error: ', e.message], 'Input error');
            update_diag(['Parameter error: ', e.message], true);
        end
    end

    % Update the diagnostics log
    function update_diag(msg, force)
        if nargin < 2, force = false; end
        if ~conversion_complete || force
            current = get(diag_text, 'String');
            new_msg = sprintf('[%s] %s', datestr(now, 'HH:MM:SS'), msg);
            set(diag_text, 'String', [current; {new_msg}]);
            set(diag_text, 'Value', length(get(diag_text, 'String')));
        end
    end

    % Toggle log visibility
    function toggle_log(~, ~)
        if strcmp(get(diag_panel, 'Visible'), 'on')
            set(diag_panel, 'Visible', 'off');
            set(tabgp, 'Position', [0.02, 0.02, 0.96, 0.35]);
        else
            set(diag_panel, 'Visible', 'on');
            set(tabgp, 'Position', [0.02, 0.38, 0.96, 0.35]);
        end
    end

    % Close-GUI callback
    function close_gui(~, ~)
        if processing
            choice = questdlg('Processing is in progress. Are you sure you want to close?', ...
                'Confirm close', 'Yes', 'No', 'No');
            if strcmp(choice, 'No')
                return;
            end
        end
        stop_audio();
        delete(fig);
    end

    % Import source audio
    function load_source(~, ~)
        if processing, return; end
        [file, path] = uigetfile({'*.wav;*.mp3;*.ogg', 'Audio files (*.wav,*.mp3,*.ogg)'});
        if isequal(file, 0), return; end
        try
            [audio, fs_in] = audioread(fullfile(path, file));
            update_diag(['Loading source audio: ', file, ' (', num2str(fs_in), 'Hz)']);
            set(status_text, 'String', ['Processing source audio: ', file]);
            drawnow;

            % Convert to mono and normalize
            if size(audio, 2) > 1
                source_audio = mean(audio, 2);
                update_diag('Converted to mono');
            else
                source_audio = audio;
            end
            source_audio = source_audio / max(abs(source_audio));
            update_diag('Normalization done');

            % Update the sample-rate parameter
            stft_params.fs = fs;

            % Sample-rate handling
            if fs == 0
                fs = fs_in;
            elseif fs ~= fs_in
                update_diag(['Resampling: ', num2str(fs_in), 'Hz -> ', num2str(fs), 'Hz']);
                source_audio = resample(source_audio, fs, fs_in);
            end

            % Show the waveform and spectrum
            plot(ax3, (0:length(source_audio)-1)/fs, source_audio);
            title(ax3, ['Source audio waveform: ', file]);
            xlabel(ax3, 'Time (s)'); ylabel(ax3, 'Amplitude'); grid(ax3, 'on');

            % Show the spectrum
            show_spectrum(ax4, source_audio, fs, stft_params, 'Source audio spectrum');
            set(status_text, 'String', ['Loaded source audio: ', file, ' (', num2str(fs/1000), 'kHz)']);
            update_diag(['Source audio length: ', num2str(length(source_audio)/fs), ' s']);

            % Reset the conversion-complete flag
            conversion_complete = false;
        catch e
            errordlg(['Failed to load source audio: ', e.message], 'Error');
            update_diag(['Error: ', e.message], true);
        end
    end

    % Import target audio
    function load_target(~, ~)
        if processing, return; end
        [file, path] = uigetfile({'*.wav;*.mp3;*.ogg', 'Audio files (*.wav,*.mp3,*.ogg)'});
        if isequal(file, 0), return; end
        try
            [audio, fs_in] = audioread(fullfile(path, file));
            update_diag(['Loading target audio: ', file, ' (', num2str(fs_in), 'Hz)']);
            set(status_text, 'String', ['Processing target audio: ', file]);
            drawnow;

            % Convert to mono and normalize
            if size(audio, 2) > 1
                target_audio = mean(audio, 2);
                update_diag('Converted to mono');
            else
                target_audio = audio;
            end
            target_audio = target_audio / max(abs(target_audio));
            update_diag('Normalization done');

            % Update the sample-rate parameter
            stft_params.fs = fs;

            % Sample-rate handling
            if fs == 0
                fs = fs_in;
            elseif fs ~= fs_in
                update_diag(['Resampling: ', num2str(fs_in), 'Hz -> ', num2str(fs), 'Hz']);
                target_audio = resample(target_audio, fs, fs_in);
            end

            % Show the waveform and spectrum
            plot(ax1, (0:length(target_audio)-1)/fs, target_audio);
            title(ax1, ['Target audio waveform: ', file]);
            xlabel(ax1, 'Time (s)'); ylabel(ax1, 'Amplitude'); grid(ax1, 'on');

            % Show the spectrum
            show_spectrum(ax5, target_audio, fs, stft_params, 'Target audio spectrum');
            set(status_text, 'String', ['Loaded target audio: ', file, ' (', num2str(fs/1000), 'kHz)']);
            update_diag(['Target audio length: ', num2str(length(target_audio)/fs), ' s']);

            % Reset the conversion-complete flag
            conversion_complete = false;
        catch e
            errordlg(['Failed to load target audio: ', e.message], 'Error');
            update_diag(['Error: ', e.message], true);
        end
    end

    %% === Post-processing added at the end of transfer_timbre ===
    function transfer_timbre(~, ~)
        if processing, return; end
        if isempty(source_audio) || isempty(target_audio)
            errordlg('Please import the source and target audio first!', 'Error');
            return;
        end

        % Set the processing state
        processing = true;
        conversion_complete = false;
        set(status_text, 'String', 'Starting timbre transfer...');
        update_diag('=== Starting timbre transfer ===');
        drawnow;

        % Unify the audio lengths (using the target length as reference)
        target_length = length(target_audio);
        source_length = length(source_audio);
        if source_length < target_length
            % Source is shorter: repeat to fill
            num_repeat = ceil(target_length / source_length);
            extended_source = repmat(source_audio, num_repeat, 1);
            source_audio_adj = extended_source(1:target_length);
            update_diag('Source audio extended to match the target length');
        elseif source_length > target_length
            % Source is longer: truncate
            source_audio_adj = source_audio(1:target_length);
            update_diag('Source audio truncated to match the target length');
        else
            source_audio_adj = source_audio;
        end

        % Make sure the lengths are compatible
        target_audio_adj = target_audio(1:min(target_length, length(source_audio_adj)));
        source_audio_adj = source_audio_adj(1:min(target_length, length(source_audio_adj)));

        try
            % === Transient detection ===
            update_diag('Detecting transient regions...');
            transients = detect_transients(target_audio_adj, stft_params.win_len, stft_params.hop_size);

            % === Target audio STFT ===
            update_diag('Running STFT on target audio...');
            update_progress(0.1, 'Target audio STFT');
            [mag_target, phase_target] = optimized_stft(target_audio_adj, stft_params, @update_progress);
            update_diag(sprintf('Target audio STFT done: %d frames', size(mag_target,2)));

            % === Source audio STFT ===
            update_diag('Running STFT on source audio...');
            update_progress(0.3, 'Source audio STFT');
            [mag_source] = optimized_stft(source_audio_adj, stft_params, @update_progress);
            update_diag(sprintf('Source audio STFT done: %d frames', size(mag_source,2)));

            % Make sure the spectrogram matrices have the same size
            if size(mag_target, 2) ~= size(mag_source, 2)
                min_frames = min(size(mag_target, 2), size(mag_source, 2));
                mag_target = mag_target(:, 1:min_frames);
                mag_source = mag_source(:, 1:min_frames);
                phase_target = phase_target(:, 1:min_frames);
                update_diag(sprintf('Adjusted spectrogram frame count: %d frames', min_frames));
            end

            % === Improved spectral conversion algorithm ===
            update_diag('Applying the improved timbre transfer algorithm...');
            update_progress(0.65, 'Spectral conversion');

            % 1. Spectral envelope of the source audio
            mag_source_env = spectral_envelope(mag_source, stft_params.lifter_order, stft_params.nfft);
            % 2. Spectral envelope of the target audio
            mag_target_env = spectral_envelope(mag_target, stft_params.lifter_order, stft_params.nfft);
            % 3. Spectral detail of the source audio (improved method)
            mag_source_detail = spectral_detail(mag_source, mag_source_env);
            % 4. Apply the conversion: target envelope + source detail
            mag_new = mag_target_env .* mag_source_detail;
            % 5. Spectral shaping (enhance timbre features)
            mag_new = spectral_shaping(mag_new, mag_source_env, mag_target_env);

            % % 6. Phase handling (use the target phase directly)
            % phase_new = phase_target;
            % update_diag('Using target audio phase');

            % === Improved phase handling ===
            if any(transients)
                update_diag('Using target phase in transient regions');
                phase_new = phase_target; % use the target phase directly in transient regions
            else
                update_diag('Reconstructing phase in non-transient regions');
                phase_new = phase_reconstruction(mag_new, phase_target, stft_params);
            end

            % === Reconstruct the audio ===
            update_diag('Reconstructing audio (ISTFT)...');
            update_progress(0.90, 'ISTFT reconstruction');
            converted_audio = optimized_istft(mag_new, phase_new, stft_params, @update_progress);
            converted_audio = converted_audio / max(abs(converted_audio)); % normalize

            % === Post-processing ===
            converted_audio = post_process(converted_audio, fs, source_audio, target_audio);

            % Make sure the length matches
            if length(converted_audio) > target_length
                converted_audio = converted_audio(1:target_length);
            elseif length(converted_audio) < target_length
                converted_audio = [converted_audio; zeros(target_length - length(converted_audio), 1)];
            end

            % Show the result
            plot(ax2, (0:length(converted_audio)-1)/fs, converted_audio);
            title(ax2, 'Converted audio waveform');
            xlabel(ax2, 'Time (s)'); ylabel(ax2, 'Amplitude'); grid(ax2, 'on');

            % Show the converted spectrum
            show_spectrum(ax6, converted_audio, fs, stft_params, 'Converted spectrum');

            % Update the status
            update_progress(1.0, 'Conversion complete');
            set(status_text, 'String', 'Timbre transfer complete!');
            update_diag('Timbre transfer succeeded!', true);

            % Set the completion flag
            conversion_complete = true;

            % Free large variables
            clear mag_target mag_source mag_new;
        catch e
            errordlg(['Timbre transfer failed: ', e.message], 'Error');
            update_diag(['Error: ', e.message], true);
            set(progress_bar, 'FaceColor', [1, 0.3, 0.3]);
            set(progress_text, 'String', 'Processing failed');
        end

        % Reset the processing state
        processing = false;
    end

    %% === Post-processing function ===
    function y = post_process(x, fs, source, target)
        % 1. Transient enhancement
        y = transient_enhancement(x, fs);
        % 2. Spectral equalization
        if ~isempty(source) && ~isempty(target)
            y = spectral_eq(y, fs, source, target);
        end
        % 3. Dynamic range control
        y = dynamic_range_control(y, fs);
        % 4. Final normalization
        y = y / max(abs(y));
    end

    %% === Transient enhancement function ===
    function y = transient_enhancement(x, fs)
        % Transient detection and enhancement
        envelope = abs(hilbert(x));
        diff_env = diff(envelope);
        diff_env = [diff_env(1); diff_env];
        threshold = 0.1 * max(abs(diff_env));
        transients = abs(diff_env) > threshold;
        attack_time = 0.005;
        decay_time = 0.05;
        attack_samples = round(attack_time * fs);
        decay_samples = round(decay_time * fs);
        gain_vector = ones(size(x));
        transient_starts = find(diff([0; transients]) == 1);
        for i = 1:length(transient_starts)
            start_idx = transient_starts(i);
            end_idx = min(start_idx + attack_samples + decay_samples, length(x));
            attack_phase = linspace(1, 1.8, attack_samples)';
            decay_phase = linspace(1.8, 1, decay_samples)';
            full_phase = [attack_phase; decay_phase];
            valid_length = min(length(full_phase), end_idx - start_idx);
            gain_vector(start_idx:start_idx+valid_length-1) = full_phase(1:valid_length);
        end
        y = x .* gain_vector;
        y = y / max(abs(y)) * 0.98;
        limiter_threshold = 0.95;
        y(y > limiter_threshold) = limiter_threshold;
        y(y < -limiter_threshold) = -limiter_threshold;
    end

    %% === Corrected spectral equalization function ===
    function y = spectral_eq(y, fs, source, target)
        % Equalization based on source and target spectral characteristics (bypasses graphicEQ)
        % 1. Spectrum analysis parameters
        window_size = 1024;
        overlap = 512;
        nfft = 1024;
        % 2. Average spectrum of the source audio
        [~, F, ~, Pxx_source] = spectrogram(...
            source, hamming(window_size), overlap, nfft, fs, 'power');
        Pxx_source = mean(Pxx_source, 2);
        % 3. Average spectrum of the target audio
        [~, ~, ~, Pxx_target] = spectrogram(...
            target, hamming(window_size), overlap, nfft, fs, 'power');
        Pxx_target = mean(Pxx_target, 2);
        % 4. Desired gain (amplitude ratio)
        desired_gain = sqrt(Pxx_target ./ (Pxx_source + eps));
        gains_db = 20*log10(desired_gain); % convert to dB
        % 5. Design the equalization filter bank (bypasses graphicEQ)
        eq_filter = design_robust_eq(F, gains_db, fs);
        % 6. Apply the equalization
        y = filter(eq_filter, y);
    end

    %% Robust equalization filter design function
    function Hd = design_robust_eq(F, gains_db, fs)
        % Method 1: try the graphiceq option of designfilt
        try
            Hd = designfilt('graphiceq', ...
                'SampleRate', fs, ...
                'Frequencies', F, ...
                'Gains', gains_db, ...
                'Bandwidth', 1/3); % 1/3-octave bandwidth
            return;
        catch
        end

        % Method 2: manually build a cascaded filter bank
        try
            % Initialize the filter collection
            filters = {};
            % Design a peaking filter for each frequency point
            for i = 1:length(F)
                % Skip invalid frequency points
                if F(i) <= 0 || F(i) >= fs/2
                    continue;
                end
                % Design the peaking filter
                [b, a] = peaking_filter(F(i), gains_db(i), fs);
                % Add it to the filter collection
                filters{end+1} = dfilt.df2(b, a); %#ok<AGROW>
            end
            % Cascade all filters
            Hd = dfilt.cascade(filters{:});
            return;
        catch
        end

        % Method 3: basic FIR equalizer (last line of defense)
        try
            % Build the target frequency response
            freq_vector = linspace(0, fs/2, 1024);
            gain_vector = interp1(F, gains_db, freq_vector, 'pchip', 'extrap');
            mag_response = 10.^(gain_vector/20); % dB to amplitude
            % Design the FIR filter
            Hd = designfilt('arbmagfir', ...
                'FilterOrder', 256, ...
                'Frequencies', freq_vector, ...
                'Amplitudes', mag_response, ...
                'SampleRate', fs);
            return;
        catch
        end

        % Return a pass-through filter when every method fails
        warning('All equalizer design methods failed; returning a pass-through filter');
        Hd = dfilt.dffir(1); % FIR filter with unit gain
    end

    %% Peaking filter design function
    function [b, a] = peaking_filter(fc, G, fs, Q)
        % Default parameter values
        if nargin < 4
            Q = 1.5; % default quality factor
        end
        % Normalized frequency
        w0 = 2*pi*fc/fs;
        alpha = sin(w0)/(2*Q);
        % Linear gain conversion
        A = 10^(G/40);
        % Filter coefficients
        b0 = 1 + alpha*A;
        b1 = -2*cos(w0);
        b2 = 1 - alpha*A;
        a0 = 1 + alpha/A;
        a1 = -2*cos(w0);
        a2 = 1 - alpha/A;
        % Normalize the coefficients
        b = [b0, b1, b2] / a0;
        a = [a0, a1, a2] / a0;
    end

    %% === Dynamic range control function ===
    function y = dynamic_range_control(y, fs)
        compressor1 = compressor(...
            'SampleRate', fs, ...
            'Threshold', -20, ...
            'Ratio', 2, ...
            'KneeWidth', 6, ...
            'AttackTime', 0.02, ...
            'ReleaseTime', 0.1, ...
            'MakeUpGainMode', 'Auto');
        y = compressor1(y);
    end

    %% === Transient detection function ===
    function transients = detect_transients(audio, win_len, hop_size)
        % Transient detection based on energy change
        num_frames = floor((length(audio)-win_len)/hop_size) + 1;
        transients = false(num_frames, 1);
        prev_energy = 0;
        threshold = 0.3; % energy-change threshold
        for i = 1:num_frames
            start_idx = (i-1)*hop_size + 1;
            end_idx = start_idx + win_len - 1;
            frame = audio(start_idx:end_idx);
            frame_energy = sum(frame.^2);
            if i > 1
                energy_diff = (frame_energy - prev_energy) / prev_energy;
                if energy_diff > threshold
                    transients(i) = true;
                end
            end
            prev_energy = frame_energy;
        end
    end

    %% === Newly added phase reconstruction function ===
    function phase = phase_reconstruction(mag, init_phase, params)
        % Griffin-Lim phase reconstruction algorithm
        num_iter = params.phase_iter;
        [num_bins, ~] = size(mag);
        current_phase = init_phase;
        for iter = 1:num_iter
            % 1. Build the complex spectrum
            S_half = mag .* exp(1i*current_phase);
            % 2. Reconstruct the time-domain signal
            x_recon = optimized_istft(mag, current_phase, params, []);
            % 3. Recompute the STFT
            [~, new_phase] = optimized_stft(x_recon, params, []);
            % 4. Update the phase
            current_phase = new_phase;
        end
        phase = current_phase;
    end

    % Replaces the original spectral_envelope function
    %% === Improved spectral envelope extraction function ===
    function env = spectral_envelope(mag, lifter_order, nfft)
        % Use cepstral analysis to extract a more accurate envelope
        [num_bins, num_frames] = size(mag);
        env = zeros(size(mag));
        for i = 1:num_frames
            % 1. Log-magnitude spectrum
            log_mag = log(mag(:, i) + eps);
            % 2. Compute the cepstrum
            cepstrum = real(ifft(log_mag, nfft));
            % 3. Liftering: keep the low-quefrency part (envelope)
            cepstrum(lifter_order+1:end-lifter_order+1) = 0;
            % 4. Rebuild the envelope
            env_frame = real(fft(cepstrum, nfft));
            env(:, i) = exp(env_frame(1:num_bins));
        end
        % 5. Smoothing along time
        env = movmean(env, 3, 2);
    end

    %% === Modified play-button callback ===
    function play_target_audio(~, ~)
        % Disable the play button to avoid repeated clicks
        set(handles.play_target_btn, 'Enable', 'off');
        drawnow;
        try
            if isempty(target_audio)
                errordlg('Target audio is empty! Please import the target audio first.', 'Playback error');
                return;
            end
            % Play in a background thread
            parfeval(@() play_audio(target_audio, fs), 0);
        catch e
            errordlg(['Playback failed: ' e.message], 'Playback error');
        end
        % Re-enable the button after a delay
        pause(1); % prevent immediate repeated clicks
        set(handles.play_target_btn, 'Enable', 'on');
    end

    %% === Modified play-button callback ===
    function play_converted_audio(~, ~)
        % Disable the play button to avoid repeated clicks
        set(handles.play_target_btn, 'Enable', 'off');
        drawnow;
        try
            if isempty(converted_audio)
                errordlg('Target audio is empty! Please import the target audio first.', 'Playback error');
                return;
            end
            % Play in a background thread
            parfeval(@() play_audio(converted_audio, fs), 0);
        catch e
            errordlg(['Playback failed: ' e.message], 'Playback error');
        end
        % Re-enable the button after a delay
        pause(1); % prevent immediate repeated clicks
        set(handles.play_target_btn, 'Enable', 'on');
    end

    % Progress update function
    function update_progress(progress, message)
        if nargin >= 1
            set(progress_bar, 'XData', [0, progress, progress, 0]);
        end
        if nargin >= 2
            set(progress_text, 'String', message);
            set(status_text, 'String', message);
        end
        if nargin == 1
            set(progress_text, 'String', sprintf('%.0f%%', progress*100));
        end
        % Force a UI refresh
        drawnow limitrate;
    end

    %% === Modified play_audio function ===
    function play_audio(audio, fs)
        % Use persistent variables to store the player state
        persistent player
        persistent is_playing
        % Initialize the state
        if isempty(is_playing)
            is_playing = false;
        end

        % === Stop the current playback ===
        if is_playing && ~isempty(player) && isplaying(player)
            stop(player);
            is_playing = false;
            update_diag('Stopped current playback');
        end

        % === Enhanced audio validation ===
        if isempty(audio) || all(audio == 0)
            errordlg('Invalid audio data!', 'Playback error');
            update_diag('Playback failed: invalid audio data', true);
            return;
        end
        % Check the audio data range
        if max(abs(audio)) > 1.5
            audio = audio / max(abs(audio));
            update_diag('Audio normalized', false);
        end

        % === Asynchronous playback ===
        try
            % Make sure the audio is a column vector
            if ~iscolumn(audio)
                audio = audio(:);
            end
            % Create a new player object
            player = audioplayer(audio, fs);
            % Set the callbacks
            set(player, 'StartFcn', @play_start_callback);
            set(player, 'StopFcn', @play_stop_callback);
            set(player, 'TimerFcn', @play_progress_callback);
            set(player, 'TimerPeriod', 0.1); % update every 100 ms
            % Play asynchronously
            play(player);
            is_playing = true;
        catch e
            % Error handling
            errordlg(['Playback failed: ' e.message], 'Playback error');
            update_diag(['Playback error: ' e.message], true);
            % Try playing through a system command
            try_system_play(audio, fs);
        end
    end

    %% === Playback callbacks ===
    function play_start_callback(obj, ~)
        % Playback-start callback
        duration = obj.TotalSamples / obj.SampleRate;
        status_msg = sprintf('Playback started (%.1f s)', duration);
        set(status_text, 'String', status_msg);
        update_diag(status_msg, false);
    end

    function play_stop_callback(~, ~)
        % Playback-end callback
        set(status_text, 'String', 'Playback finished');
        update_diag('Playback finished', false);
    end

    function play_progress_callback(obj, ~)
        % Playback-progress callback
        if isplaying(obj)
            current = obj.CurrentSample;
            total = obj.TotalSamples;
            fs = obj.SampleRate;
            elapsed = current / fs;
            total_time = total / fs;
            progress = current / total;
            set(progress_bar, 'XData', [0, progress, progress, 0]);
            set(progress_text, 'String', sprintf('Playback progress: %.0f%%', progress*100));
            status = sprintf('Playing: %.1f/%.1f s', elapsed, total_time);
            set(status_text, 'String', status);
        end
    end

    %% === Fallback playback ===
    function try_system_play(audio, fs)
        % Play through a system command when audioplayer fails
        try
            % Save a temporary audio file
            temp_file = [tempname '.wav'];
            audiowrite(temp_file, audio, fs);
            % Cross-platform playback command
            if ispc
                % Windows
                system(['start "" "' temp_file '"']);
            elseif ismac
                % macOS
                system(['afplay "' temp_file '" &']);
            else
                % Linux
                system(['aplay "' temp_file '" &']);
            end
            update_diag(['Playing through a system command: ' temp_file], true);
            set(status_text, 'String', 'Playing audio in an external player');
        catch e
            errordlg(['Fallback playback failed: ' e.message], 'Playback error');
            update_diag(['Fallback playback failed: ' e.message], true);
        end
    end

    function stop_audio(~, ~)
        try
            % Get all current player objects
            allPlayers = findall(0, 'Type', 'audioplayer');
            % Stop and delete all players
            for i = 1:length(allPlayers)
                if isplaying(allPlayers(i))
                    stop(allPlayers(i));
                end
                delete(allPlayers(i));
            end
            set(status_text, 'String', 'Playback stopped');
            update_diag('Audio playback stopped', true);
        catch e
            errordlg(['Failed to stop playback: ', e.message], 'Error');
            update_diag(['Stop playback error: ', e.message], true);
        end
    end

    % Save-audio function
    function save_audio(~, ~)
        if processing
            errordlg('Processing is in progress; please save later', 'Error');
            return;
        end
        if isempty(converted_audio)
            errordlg('There is no converted audio to save!', 'Error');
            return;
        end
        [file, path] = uiputfile('*.wav', 'Save converted audio');
        if isequal(file, 0), return; end
        set(status_text, 'String', 'Saving audio...');
        update_diag(['Saving: ', file], true);
        try
            % Save the audio directly
            filename = fullfile(path, file);
            audiowrite(filename, converted_audio, fs);
            set(status_text, 'String', ['Saved: ', file]);
            update_diag(['Audio saved: ', filename], true);
        catch e
            errordlg(['Save failed: ', e.message], 'Error');
            update_diag(['Save error: ', e.message], true);
        end
    end

    function show_spectrum(ax, audio, fs, params, title_str)
        try
            % Check the input audio
            if isempty(audio) || length(audio) < params.win_len
                error('Invalid audio data: length=%d, need >=%d', length(audio), params.win_len);
            end
            % Compute the STFT
            [~, ~, f, t] = optimized_stft(audio, params, []);
            % Use the dimension checks inside optimized_stft directly
            [mag, ~, f, t] = optimized_stft(audio, params, []);
            spec_data = 10*log10(abs(mag) + eps);

            % Plot
            cla(ax);
            imagesc(ax, t, f, spec_data);

            % Axis settings
            set(ax, 'YDir', 'normal');
            axis(ax, 'tight');
            ylim(ax, [0, fs/2]); % limit to the Nyquist frequency
            title(ax, [title_str, sprintf(' (%.1f s)', length(audio)/fs)]);
            xlabel(ax, 'Time (s)');
            ylabel(ax, 'Frequency (Hz)');
            colorbar(ax);
            colormap(ax, 'jet');
        catch e
            % Error handling
            cla(ax);
            err_msg = sprintf('Spectrum error: %s\nAudio size: %dx%d', e.message, size(audio,1), size(audio,2));
            text(ax, 0.5, 0.5, err_msg, ...
                'HorizontalAlignment', 'center', ...
                'Color', 'red', ...
                'FontSize', 9);
            title(ax, [title_str, ' (error)']);
        end
    end
end

%% === Single-frame phase reconstruction function ===
function phase_frame = frame_phase_reconstruction(mag_frame, init_phase, params)
    % Fast single-frame phase reconstruction based on the MAGNA method
    num_iter = 3; % reduced iteration count
    nfft = params.nfft;
    phase_frame = init_phase;
    % Build the initial complex spectrum
    S_half = mag_frame .* exp(1i*phase_frame);
    % Build the full spectrum
    S_full = zeros(nfft, 1);
    S_full(1:length(mag_frame)) = S_half;
    S_full(end-length(mag_frame)+2:end) = conj(S_half(end-1:-1:2));
    for iter = 1:num_iter
        % 1. Inverse FFT
        frame = real(ifft(S_full));
        % 2. Forward FFT
        S_full_new = fft(frame, nfft);
        % 3. Keep the magnitude, update the phase
        S_full_new = mag_frame .* exp(1i*angle(S_full_new(1:length(mag_frame))));
        % 4. Rebuild the full spectrum
        S_full(1:length(mag_frame)) = S_full_new;
        S_full(end-length(mag_frame)+2:end) = conj(S_full_new(end-1:-1:2));
    end
    phase_frame = angle(S_full_new);
end

%% === Improved spectral detail function ===
function detail = spectral_detail(mag, env)
    % More natural detail extraction
    alpha = 0.5; % detail enhancement factor
    beta = 0.1;  % noise suppression factor
    % Basic detail computation
    base_detail = mag ./ (env + eps);
    % Apply a compression function to avoid extreme values
    detail = tanh(alpha * base_detail) / tanh(alpha);
    % Frequency-domain smoothing
    detail = imgaussfilt(detail, 1.5);
end

%% === Improved spectral shaping ===
function mag_out = spectral_shaping(mag, env_source, env_target)
    % More conservative spectral shaping
    balance_factor = 0.5; % reduce the strength of the source timbre features
    % Compute the spectral ratio factor (band-limited)
    freq_range = 1:min(100, size(mag,1)); % only affects the low-frequency region
    ratio = ones(size(mag));
    ratio(freq_range, :) = (env_source(freq_range, :) ./ env_target(freq_range, :)).^balance_factor;
    % Limit the ratio range
    ratio = min(max(ratio, 0.8), 1.2);
    % Apply the ratio factor
    mag_out = mag .* ratio;
end

%% === Core audio processing functions ===
function [mag, phase, f, t] = optimized_stft(x, params, progress_callback)
    % Extract parameters
    win_len = params.win_len;
    hop_size = params.hop_size;
    nfft = params.nfft;
    fs = params.fs;

    % Input validation
    if isempty(x) || length(x) < win_len
        error('Invalid input: signal length (%d) < window length (%d)', length(x), win_len);
    end

    % Create the window function
    win = hann(win_len, 'periodic');

    % Compute the number of frames
    num_frames = floor((length(x) - win_len) / hop_size) + 1;

    % Initialize the STFT matrix
    stft_matrix = zeros(nfft, num_frames);

    % === Key fix: correct time-vector computation ===
    % Center time of each frame (seconds)
    t = ((0:num_frames-1) * hop_size + win_len/2) / fs;

    % Run the STFT
    for i = 1:num_frames
        start_idx = (i-1) * hop_size + 1;
        end_idx = min(start_idx + win_len - 1, length(x));
        segment = x(start_idx:end_idx);
        % Zero-pad segments shorter than the window
        if length(segment) < win_len
            segment = [segment; zeros(win_len - length(segment), 1)];
        end
        segment = segment .* win;
        X = fft(segment, nfft);
        stft_matrix(:, i) = X;
        % Progress update
        if ~isempty(progress_callback)
            progress = i / num_frames;
            progress_callback(progress);
        end
    end

    % Keep the one-sided spectrum
    num_freq_bins = floor(nfft/2) + 1;
    stft_matrix = stft_matrix(1:num_freq_bins, :);

    % Compute magnitude and phase
    mag = abs(stft_matrix);
    phase = angle(stft_matrix);

    % Frequency vector (Hz)
    f = (0:num_freq_bins-1)' * (fs / nfft);

    % === Dimension checks ===
    assert(size(mag, 1) == length(f), ...
        'Frequency dimension mismatch: mag rows=%d, f length=%d', size(mag,1), length(f));
    assert(size(mag, 2) == length(t), ...
        'Time dimension mismatch: mag columns=%d, t length=%d', size(mag,2), length(t));
end

function x_recon = optimized_istft(mag, phase, params, progress_callback)
    % Optimized inverse short-time Fourier transform (ISTFT)
    % Inputs:
    %   mag               - magnitude spectrum (one-sided)
    %   phase             - phase spectrum (one-sided)
    %   params            - parameter struct
    %   progress_callback - progress callback function
    % Output:
    %   x_recon           - reconstructed time-domain signal

    % === Enhanced input validation ===
    if isempty(mag) || isempty(phase)
        error('ISTFT input is empty');
    end

    % === Extract parameters ===
    nfft = params.nfft;
    win_len = params.win_len;
    hop_size = win_len - params.overlap;
    win_synth = params.win_synthesis;
    [num_bins, num_frames] = size(mag);

    % Total signal length
    total_samples = (num_frames - 1) * hop_size + win_len;
    x_recon = zeros(total_samples, 1);

    % Progress update interval
    update_interval = max(1, floor(num_frames/10));

    % === Rebuild the complex spectrum ===
    S_half = mag .* exp(1i * phase);

    % === Build the two-sided spectrum ===
    S_full = zeros(nfft, num_frames);
    if rem(nfft, 2) % odd FFT size
        S_full(1:num_bins, :) = S_half;
        S_full(num_bins+1:end, :) = conj(S_half(end:-1:2, :));
    else % even FFT size
        S_full(1:num_bins, :) = S_half;
        % Note: Nyquist bin handling
        S_full(num_bins+1:end, :) = conj(S_half(end-1:-1:2, :));
    end

    % === Inverse FFT and overlap-add ===
    for frame_idx = 1:num_frames
        % 1. Inverse FFT
        frame = real(ifft(S_full(:, frame_idx), nfft));
        % 2. Apply the synthesis window
        frame_win = frame(1:win_len) .* win_synth;
        % 3. Compute the position and add
        start_idx = (frame_idx - 1) * hop_size + 1;
        end_idx = start_idx + win_len - 1;
        % Stay within bounds
        if end_idx > total_samples
            end_idx = total_samples;
            frame_win = frame_win(1:(end_idx - start_idx + 1));
        end
        % Overlap-add
        x_recon(start_idx:end_idx) = x_recon(start_idx:end_idx) + frame_win;
        % 4. Progress update
        if ~isempty(progress_callback) && mod(frame_idx, update_interval) == 0
            progress_callback(frame_idx/num_frames * 0.2, ...
                sprintf('ISTFT reconstruction: %d/%d', frame_idx, num_frames));
        end
    end

    % === Normalization ===
    % Overlap factor
    overlap_factor = win_len / hop_size;
    % Compute the normalization window
    norm_win = zeros(total_samples, 1);
    for i = 1:num_frames
        start_idx = (i - 1) * hop_size + 1;
        end_idx = min(start_idx + win_len - 1, total_samples);
        norm_win(start_idx:end_idx) = norm_win(start_idx:end_idx) + win_synth(1:(end_idx-start_idx+1)).^2;
    end
    % Avoid division by zero
    norm_win(norm_win < eps) = eps;
    % Apply the normalization
    x_recon = x_recon ./ norm_win;
end

With this code, the run always stalls in the ISTFT stage, and the progress value at which it stalls is different every time. Please fix it.
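One detail worth checking in code like the above is the overlap-add normalization. `optimized_stft` multiplies each frame by an analysis window and `optimized_istft` multiplies it by `win_synthesis`, so the overlap-added signal is scaled by the running sum of analysis-times-synthesis window products, whereas the posted code divides by the sum of squared synthesis windows. The sketch below is not the poster's code and is not offered as a diagnosis of the stall. It is a minimal pure-Python illustration of weighted overlap-add with that normalization, and all names in it are hypothetical.

```python
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]

def wola_roundtrip(x, win_a, win_s, hop):
    """Analysis-window each frame, overlap-add with win_s, then normalize.

    With an unmodified spectrum, an FFT followed by an inverse FFT returns
    the windowed frame unchanged, so the transform pair is omitted here.
    """
    n = len(win_a)
    num_frames = (len(x) - n) // hop + 1
    total = (num_frames - 1) * hop + n
    out = [0.0] * total
    norm = [0.0] * total
    for i in range(num_frames):
        start = i * hop
        for k in range(n):
            weight = win_a[k] * win_s[k]
            out[start + k] += x[start + k] * weight
            norm[start + k] += weight  # analysis * synthesis, not synthesis^2
    # Samples never covered by a nonzero weight are left at zero
    return [o / w if w > 1e-12 else 0.0 for o, w in zip(out, norm)]

if __name__ == "__main__":
    signal = [math.sin(0.3 * k) for k in range(64)]
    win = hann(16)
    synth = [0.25 * w for w in win]  # any scaling of the synthesis window cancels out
    rebuilt = wola_roundtrip(signal, win, synth, hop=4)
    print(max(abs(a - b) for a, b in zip(signal[1:], rebuilt[1:])))
```

Because the accumulated weight is computed from the same windows that were applied, the reconstruction does not depend on how the synthesis window is scaled, which removes the need for a separately tuned scale factor.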

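Another potential issue: in `optimized_istft`, the reconstructed two-sided spectrum `S_full` has size `nfft × num_frames`, and after the inverse FFT we take `frame(1:win_len)`. If `win_len` is greater than `nfft`, this causes an indexing error. In our settings, however, `win_len` equals `nfft`...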
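The constraint described here can be checked before any transform runs. The sketch below is a hypothetical helper, not part of the posted code, that reports parameter combinations which would break the frame indexing. It also flags an overlap equal to the window length, since the hop size is computed as `win_len - overlap` and the posted parameter validation accepts an overlap of 100%, which would give a hop size of zero.

```python
def check_stft_params(win_len, overlap, nfft):
    """Return a list of problems found in the STFT parameters (empty if none)."""
    problems = []
    if win_len > nfft:
        # The inverse FFT yields only nfft samples, so frame(1:win_len) cannot be taken
        problems.append("win_len exceeds nfft")
    if not 0 <= overlap < win_len:
        # hop = win_len - overlap must be positive for the frame count to be finite
        problems.append("overlap must satisfy 0 <= overlap < win_len")
    return problems

if __name__ == "__main__":
    print(check_stft_params(2048, 1536, 2048))
    print(check_stft_params(4096, 1536, 2048))
    print(check_stft_params(2048, 2048, 2048))
```

Running a check of this kind whenever the user edits the parameters would turn both cases into an immediate, readable error instead of a failure deep inside the reconstruction loop.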

% Simplified spectrum display function (no longer includes the envelope computation)
function show_spectrum(ax, audio, fs, params, title_str)
    try
        % Compute the STFT
        [S, f, t] = optimized_stft(audio, params, []);
        % Process the spectrum data
        mag = abs(S);
        spec_data = 10*log10(mag + eps);
        % === Enhanced dimension validation ===
        % Make sure the frequency vector is a column vector
        if ~iscolumn(f)
            f = f(:);
        end
        % Make sure the time vector is a row vector
        if ~isrow(t)
            t = t(:)';
        end
        % === Dimension consistency check ===
        if size(spec_data, 1) ~= length(f) || size(spec_data, 2) ~= length(t)
            min_bins = min(size(spec_data, 1), length(f));
            min_frames = min(size(spec_data, 2), length(t));
            spec_data = spec_data(1:min_bins, 1:min_frames);
            f = f(1:min_bins);
            t = t(1:min_frames);
            update_diag(sprintf('维度自动调整: spec_data(%d×%d), f(%d), t(%d)',...
                size(spec_data,1), size(spec_data,2), length(f), length(t)));
        end
        % === Coordinate validation ===
        % Keep the frequencies within a sensible range
        nyquist = fs/2;
        if any(f > nyquist)
            f(f > nyquist) = nyquist;
        end
        % Clear the old content
        cla(ax);
        % === Draw the spectrogram ===
        imagesc(ax, t, f, spec_data);
        % === Axis properties ===
        set(ax, 'YDir', 'normal'); % low frequencies at the bottom
        axis(ax, 'tight'); % fit the axis limits automatically
        % === Display properties ===
        title(ax, title_str);
        xlabel(ax, '时间 (s)');
        ylabel(ax, '频率 (Hz)');
        colorbar(ax);
        colormap(ax, 'jet');
        % Set the frequency range
        max_freq = min(fs/2, max(f));
        ylim(ax, [min(f), max_freq]);
        % Add diagnostic information
        update_diag(sprintf('频谱显示成功: %s (尺寸: %d×%d)', title_str, size(spec_data,1), size(spec_data,2)));
    catch e
        % Detailed error information
        dim_info = sprintf('维度: spec_data(%d×%d), f(%d), t(%d)',...
            size(spec_data,1), size(spec_data,2), length(f), length(t));
        err_msg = sprintf('频谱错误: %s\n%s', e.message, dim_info);
        % Show the error message
        cla(ax);
        text(ax, 0.5, 0.5, err_msg, ...
            'HorizontalAlignment', 'center', ...
            'FontSize', 8, 'Color', 'red');
        title(ax, [title_str, ' (错误)']);
        % Record the detailed error in the diagnostics
        update_diag(['频谱显示错误: ' err_msg], true);
    end
end
end
%% === Core audio processing functions ===
function [mag, phase, f, t] = optimized_stft(x, params, progress_callback)
% Optimized short-time Fourier transform (STFT) implementation
% Inputs:
%   x - time-domain signal
%   params - parameter struct (window length, overlap, FFT size, window function, etc.)
%   progress_callback - progress callback function
% Outputs:
%   mag - magnitude spectrum (one-sided)
%   phase - phase spectrum (one-sided)
%   f - frequency vector (Hz)
%   t - time vector (seconds)
% === Parameter extraction ===
win_len = params.win_len;
overlap = params.overlap;
nfft = params.nfft;
win_anal = params.window;
fs = params.fs;
hop_size = win_len - overlap;
% === Number of STFT frames ===
num_samples = length(x);
num_frames = floor((num_samples - overlap) / hop_size);
% === Initialize the STFT matrix ===
S = zeros(nfft, num_frames); % full two-sided spectrum
% Progress update interval
update_interval = max(1, floor(num_frames/10));
% === Frame-by-frame processing ===
for frame_idx = 1:num_frames
    % 1. Start and end indices of the current frame
    start_idx = (frame_idx - 1) * hop_size + 1;
    end_idx = start_idx + win_len - 1;
    % Boundary handling: if the last frame runs past the end of the signal, pad it with zeros
    if end_idx > num_samples
        frame = [x(start_idx:end); zeros(end_idx - num_samples, 1)];
    else
        frame = x(start_idx:end_idx);
    end
    % 2. Apply the window
    frame_win = frame .* win_anal;
    % 3. FFT
    S_frame = fft(frame_win, nfft);
    S(:, frame_idx) = S_frame;
    % 4. Progress update
    if ~isempty(progress_callback) && mod(frame_idx, update_interval) == 0
        progress_callback(frame_idx/num_frames * 0.2, ...
            sprintf('STFT计算: %d/%d', frame_idx, num_frames));
    end
end
% === One-sided spectrum ===
if rem(nfft, 2) % odd FFT length
    num_bins = (nfft+1)/2;
else
    num_bins = nfft/2 + 1;
end
S_half = S(1:num_bins, :); % one-sided spectrum
% Magnitude and phase
mag = abs(S_half);
phase = angle(S_half);
% Frequency vector
f = (0:num_bins-1) * fs / nfft;
% Time vector
t = (0:num_frames-1) * hop_size / fs;
end
function x_recon = optimized_istft(mag, phase, params, progress_callback)
% Optimized inverse short-time Fourier transform (ISTFT) implementation
% Inputs:
%   mag - magnitude spectrum (one-sided)
%   phase - phase spectrum (one-sided)
%   params - parameter struct
%   progress_callback - progress callback function
% Output:
%   x_recon - reconstructed time-domain signal
% === Parameter extraction ===
nfft = params.nfft;
win_len = params.win_len;
hop_size = win_len - params.overlap;
win_synth = params.win_synthesis;
[num_bins, num_frames] = size(mag);
% Total signal length
total_samples = (num_frames - 1) * hop_size + win_len;
x_recon = zeros(total_samples, 1);
% Progress update interval
update_interval = max(1, floor(num_frames/10));
% === Rebuild the complex spectrum ===
S_half = mag .* exp(1i * phase);
% === Build the two-sided spectrum ===
S_full = zeros(nfft, num_frames);
if rem(nfft, 2) % odd FFT length
    S_full(1:num_bins, :) = S_half;
    S_full(num_bins+1:end, :) = conj(S_half(end:-1:2, :));
else % even FFT length
    S_full(1:num_bins, :) = S_half;
    % Note: Nyquist bin handling
    S_full(num_bins+1:end, :) = conj(S_half(end-1:-1:2, :));
end
% === Inverse FFT and overlap-add ===
for frame_idx = 1:num_frames
    % 1. Inverse FFT
    frame = real(ifft(S_full(:, frame_idx), nfft));
    % 2. Apply the synthesis window
    frame_win = frame(1:win_len) .* win_synth;
    % 3. Compute the position and accumulate
    start_idx = (frame_idx - 1) * hop_size + 1;
    end_idx = start_idx + win_len - 1;
    % Make sure we stay within bounds
    if end_idx > total_samples
        end_idx = total_samples;
        frame_win = frame_win(1:(end_idx - start_idx + 1));
    end
    % Overlap-add
    x_recon(start_idx:end_idx) = x_recon(start_idx:end_idx) + frame_win;
    % 4. Progress update
    if ~isempty(progress_callback) && mod(frame_idx, update_interval) == 0
        progress_callback(frame_idx/num_frames * 0.2, ...
            sprintf('ISTFT重建: %d/%d', frame_idx, num_frames));
    end
end
% === Normalization ===
% Compute the overlap factor
overlap_factor = win_len / hop_size;
% Compute the normalization window
norm_win = zeros(total_samples, 1);
for i = 1:num_frames
    start_idx = (i - 1) * hop_size + 1;
    end_idx = min(start_idx + win_len - 1, total_samples);
    norm_win(start_idx:end_idx) = norm_win(start_idx:end_idx) + win_synth(1:(end_idx-start_idx+1)).^2;
end
% Avoid division by zero
norm_win(norm_win < eps) = eps;
% Apply the normalization
x_recon = x_recon ./ norm_win;
end

You misunderstood what I meant just now. This is my original code. Earlier in the conversation I asked you to rewrite the complete code, but the answer was cut off at the display function. Based on the code I have provided, please fix it and give all of the complete code that follows the display function, including every part.

function env = spectral_envelope(mag, lifter_order, nfft)
    % ... [keep the original implementation] ...
end
% Progress update function
function update_progress(progress, message)
    % ... [keep the original implementation] ...
end
% Audio playback function ...

function audio_pitch_correction % 创建主GUI界面 fig = uifigure('Name', '音频音准矫正系统', 'Position', [100 100 900 700]); % 创建音频选择区域 uilabel(fig, 'Position', [50 680 300 20], 'Text', '待矫正音频来源:', 'FontWeight', 'bold'); % 创建录音选项按钮组 source_btn_group = uibuttongroup(fig, 'Position', [50 630 300 40], 'Title', ''); uibutton(source_btn_group, 'Position', [10 10 130 30], 'Text', '导入音频文件', ... 'ButtonPushedFcn', @(btn,event) select_audio(fig, 'source')); uibutton(source_btn_group, 'Position', [160 10 130 30], 'Text', '录制音频', ... 'ButtonPushedFcn', @(btn,event) record_audio(fig)); % 创建参考音频选择按钮 uilabel(fig, 'Position', [400 680 300 20], 'Text', '参考音频来源:', 'FontWeight', 'bold'); uibutton(fig, 'Position', [400 630 150 30], 'Text', '导入参考音频', ... 'ButtonPushedFcn', @(btn,event) select_audio(fig, 'reference')); % 创建处理按钮 process_btn = uibutton(fig, 'Position', [600 630 150 30], ... 'Text', '开始矫正', 'Enable', 'off', ... 'ButtonPushedFcn', @(btn,event) process_audio(fig)); % 创建播放和保存按钮 uibutton(fig, 'Position', [50 580 150 30], 'Text', '播放原始音频', ... 'ButtonPushedFcn', @(btn,event) play_audio(fig, 'source')); uibutton(fig, 'Position', [250 580 150 30], 'Text', '播放矫正音频', ... 'ButtonPushedFcn', @(btn,event) play_audio(fig, 'corrected')); uibutton(fig, 'Position', [450 580 150 30], 'Text', '保存矫正音频', ... 'ButtonPushedFcn', @(btn,event) save_audio(fig)); % 创建录音状态显示 recording_label = uilabel(fig, 'Position', [650 580 200 30], ... 
'Text', '准备录音', 'FontColor', [0 0.5 0]); % 创建波形显示区域 ax_source = uiaxes(fig, 'Position', [50 350 800 150]); title(ax_source, '待矫正音频波形'); ax_reference = uiaxes(fig, 'Position', [50 180 800 150]); title(ax_reference, '参考音频波形'); ax_corrected = uiaxes(fig, 'Position', [50 10 800 150]); title(ax_corrected, '矫正后音频波形'); % 存储数据 fig.UserData.source_audio = []; fig.UserData.reference_audio = []; fig.UserData.corrected_audio = []; fig.UserData.fs = 44100; % 默认采样率 fig.UserData.process_btn = process_btn; fig.UserData.axes = struct('source', ax_source, 'reference', ax_reference, 'corrected', ax_corrected); fig.UserData.recording_label = recording_label; fig.UserData.recorder = []; % 录音器对象 fig.UserData.timer = []; % 计时器对象 end function select_audio(fig, audio_type) [file, path] = uigetfile({'.wav;.mp3;.ogg;.flac', ... '音频文件 (.wav,.mp3,.ogg,.flac)'}); if isequal(file, 0) return; end filename = fullfile(path, file); [audio, fs] = audioread(filename); % 处理立体声：转换为单声道 if size(audio, 2) > 1 audio = mean(audio, 2); end % 截取前20秒 max_samples = min(20fs, length(audio)); audio = audio(1:max_samples); % 存储数据 fig.UserData.([audio_type '_audio']) = audio; fig.UserData.fs = fs; % 更新波形显示 ax = fig.UserData.axes.(audio_type); plot(ax, (1:length(audio))/fs, audio); xlabel(ax, '时间 (s)'); ylabel(ax, '幅度'); % 启用处理按钮 if ~isempty(fig.UserData.source_audio) && ~isempty(fig.UserData.reference_audio) fig.UserData.process_btn.Enable = 'on'; end end function record_audio(fig) % 创建录音界面 record_fig = uifigure('Name', '音频录制', 'Position', [300 300 400 200]); % 录音时长设置 uilabel(record_fig, 'Position', [50 150 100 20], 'Text', '录音时长 (秒):'); duration_edit = uieditfield(record_fig, 'numeric', ... 'Position', [160 150 100 20], 'Value', 5, 'Limits', [1 30]); % 采样率设置 uilabel(record_fig, 'Position', [50 120 100 20], 'Text', '采样率:'); fs_dropdown = uidropdown(record_fig, ... 'Position', [160 120 100 20], ... 'Items', {'8000', '16000', '44100', '48000'}, ... 
'Value', '44100'); % 控制按钮 record_btn = uibutton(record_fig, 'Position', [50 70 100 30], ... 'Text', '开始录音', ... 'ButtonPushedFcn', @(btn,event) start_recording(fig, duration_edit.Value, str2double(fs_dropdown.Value))); uibutton(record_fig, 'Position', [160 70 100 30], ... 'Text', '停止录音', ... 'ButtonPushedFcn', @(btn,event) stop_recording(fig)); uibutton(record_fig, 'Position', [270 70 100 30], ... 'Text', '关闭', ... 'ButtonPushedFcn', @(btn,event) close(record_fig)); end function start_recording(fig, duration, fs) % 更新状态 fig.UserData.recording_label.Text = '录音中...'; fig.UserData.recording_label.FontColor = [1 0 0]; drawnow; % 创建录音器对象 recorder = audiorecorder(fs, 16, 1); % 16-bit, 单声道 % 设置录音时长 fig.UserData.recorder = recorder; fig.UserData.fs = fs; % 开始录音 record(recorder, duration); % 创建计时器显示剩余时间 t = timer('ExecutionMode', 'fixedRate', 'Period', 1, ... 'TasksToExecute', duration, ... 'TimerFcn', @(t,~) update_recording_timer(fig, t, duration)); start(t); % 存储计时器 fig.UserData.timer = t; end function update_recording_timer(fig, t, total_duration) elapsed = t.TasksExecuted; remaining = total_duration - elapsed; fig.UserData.recording_label.Text = sprintf('录音中: %d秒', remaining); % 录音结束时自动停止 if remaining <= 0 stop_recording(fig); end end function stop_recording(fig) if ~isempty(fig.UserData.recorder) && isrecording(fig.UserData.recorder) stop(fig.UserData.recorder); end % 停止计时器 if ~isempty(fig.UserData.timer) && isvalid(fig.UserData.timer) stop(fig.UserData.timer); delete(fig.UserData.timer); fig.UserData.timer = []; end % 获取录音数据 audio = getaudiodata(fig.UserData.recorder); fs = fig.UserData.fs; % 更新状态 fig.UserData.recording_label.Text = '录音完成!'; fig.UserData.recording_label.FontColor = [0 0.5 0]; % 存储为待矫正音频 fig.UserData.source_audio = audio; % 更新波形显示 ax = fig.UserData.axes.source; plot(ax, (1:length(audio))/fs, audio); title(ax, '录制音频波形'); xlabel(ax, '时间 (s)'); ylabel(ax, '幅度'); % 启用处理按钮 if ~isempty(fig.UserData.reference_audio) fig.UserData.process_btn.Enable = 'on'; 
end end function process_audio(fig) source = fig.UserData.source_audio; reference = fig.UserData.reference_audio; fs = fig.UserData.fs; % 确保主图窗存在 if ~isvalid(fig) errordlg('主窗口已关闭，无法处理音频!', '处理错误'); return; end % 创建处理进度对话框 h = uiprogressdlg(fig, 'Title', '处理中', 'Message', '音频对齐...', 'Indeterminate', 'on'); % 步骤1：音频对齐 try [aligned_source, aligned_ref] = improved_align_audio(source, reference, fs); catch ME close(h); errordlg(['音频对齐失败: ' ME.message], '处理错误'); return; end % 步骤2：基频提取 h.Message = '提取音高...'; try [f0_source, time_source] = extract_pitch(aligned_source, fs); [f0_ref, time_ref] = extract_pitch(aligned_ref, fs); catch ME close(h); errordlg(['音高提取失败: ' ME.message], '处理错误'); return; end % 步骤3：音调矫正 h.Message = '矫正音调...'; try [corrected, f0_corrected] = correct_pitch(fig, aligned_source, fs, f0_source, f0_ref, time_source, time_ref); catch ME close(h); errordlg(['音高校正失败: ' ME.message], '处理错误'); return; end % 关闭进度对话框 close(h); % === 关键修复 1: 存储矫正结果 === fig.UserData.corrected_audio = corrected; % === 关键修复 2: 更新播放按钮状态 === play_btn = findobj(fig, 'Text', '播放矫正音频'); if ~isempty(play_btn) play_btn.Enable = 'on'; end % 保存结果并更新显示 % 更新原始音频波形图（添加音高曲线） ax_src = fig.UserData.axes.source; cla(ax_src); yyaxis(ax_src, 'left'); plot(ax_src, (1:length(aligned_source))/fs, aligned_source, 'b'); ylabel(ax_src, '幅度'); yyaxis(ax_src, 'right'); plot(ax_src, time_source, f0_source, 'r', 'LineWidth', 1.5); ylabel(ax_src, '频率 (Hz)'); title(ax_src, '原始音频波形与音高'); grid(ax_src, 'on'); % 更新参考音频波形图（添加音高曲线） ax_ref = fig.UserData.axes.reference; cla(ax_ref); yyaxis(ax_ref, 'left'); plot(ax_ref, (1:length(aligned_ref))/fs, aligned_ref, 'g'); ylabel(ax_ref, '幅度'); yyaxis(ax_ref, 'right'); plot(ax_ref, time_ref, f0_ref, 'm', 'LineWidth', 1.5); ylabel(ax_ref, '频率 (Hz)'); title(ax_ref, '参考音频波形与音高'); grid(ax_ref, 'on'); % 更新矫正后音频波形图（添加音高曲线） ax_corr = fig.UserData.axes.corrected; cla(ax_corr); yyaxis(ax_corr, 'left'); plot(ax_corr, (1:length(corrected))/fs, corrected, 'Color', [0.5 0 0.5]); 
ylabel(ax_corr, '幅度'); yyaxis(ax_corr, 'right'); plot(ax_corr, time_source, f0_corrected, 'Color', [1 0.5 0], 'LineWidth', 2); ylabel(ax_corr, '频率 (Hz)'); title(ax_corr, '矫正后音频波形与音高'); grid(ax_corr, 'on'); % 绘制综合音高对比图 % 修改后的调用：添加音频波形参数 plot_pitch_comparison(time_source, f0_source, time_ref, f0_ref, f0_corrected,... aligned_source, aligned_ref, corrected, fs); fprintf('原始音高平均: %.1f Hz\n', mean(f0_source(f0_source>0))); fprintf('参考音高平均: %.1f Hz\n', mean(f0_ref(f0_ref>0))); fprintf('矫正后音高平均: %.1f Hz\n', mean(f0_corrected(f0_corrected>0))); end function [aligned_src, aligned_ref] = improved_align_audio(src, ref, fs) % 改进的音频对齐方法：使用频谱互相关 win_size = round(0.1 fs); % 100ms窗口 hop_size = round(0.05 * fs); % 50ms跳跃 % 计算源音频的频谱图 [S_src, ~, t_src] = spectrogram(src, win_size, win_size-hop_size, win_size, fs); % 计算参考音频的频谱图 [S_ref, ~, t_ref] = spectrogram(ref, win_size, win_size-hop_size, win_size, fs); % 计算互相关 n_frames = min(length(t_src), length(t_ref)); corr_vals = zeros(1, n_frames); for i = 1:n_frames spec_src = abs(S_src(:, i)); spec_ref = abs(S_ref(:, i)); corr_vals(i) = dot(spec_src, spec_ref) / (norm(spec_src) * norm(spec_ref)); end % 找到最大相关帧 [~, max_idx] = max(corr_vals); time_diff = t_src(max_idx) - t_ref(max_idx); sample_diff = round(time_diff * fs); % 对齐音频 if sample_diff > 0 aligned_src = src(1:end-sample_diff); aligned_ref = ref(sample_diff+1:end); else aligned_src = src(-sample_diff+1:end); aligned_ref = ref(1:end+sample_diff); end % 确保等长 min_len = min(length(aligned_src), length(aligned_ref)); aligned_src = aligned_src(1:min_len); aligned_ref = aligned_ref(1:min_len); end function mfcc = mfcc_feature(audio, fs, frame_size, hop_size) % 参数验证 if nargin < 4 hop_size = round(frame_size/2); % 默认50%重叠 end % 预处理：预加重 audio = filter([1 -0.97], 1, audio); % 分帧处理 frames = buffer(audio, frame_size, frame_size - hop_size, 'nodelay'); num_frames = size(frames, 2); % 加窗（汉明窗） window = hamming(frame_size); windowed_frames = frames .* repmat(window, 1, num_frames); % 计算功率谱 nfft = 
2^nextpow2(frame_size); mag_frames = abs(fft(windowed_frames, nfft)); power_frames = (mag_frames(1:nfft/2+1, :)).^2; % 设计梅尔滤波器组 num_filters = 26; % 滤波器数量 mel_min = 0; % 最小Mel频率 mel_max = 2595 * log10(1 + (fs/2)/700); % 最大Mel频率 % 创建等间隔的Mel频率点 mel_points = linspace(mel_min, mel_max, num_filters + 2); % 将Mel频率转换为线性频率 hz_points = 700 * (10.^(mel_points/2595) - 1); % 转换为FFT bin索引 bin_indices = floor((nfft+1) * hz_points / fs); % 创建梅尔滤波器组 filter_bank = zeros(num_filters, nfft/2+1); for m = 2:num_filters+1 left = bin_indices(m-1); center = bin_indices(m); right = bin_indices(m+1); % 左侧斜坡 for k = left:center-1 filter_bank(m-1, k+1) = (k - left) / (center - left); end % 右侧斜坡 for k = center:right filter_bank(m-1, k+1) = (right - k) / (right - center); end end % 应用梅尔滤波器组 mel_spectrum = filter_bank * power_frames; % 取对数 log_mel = log(mel_spectrum + eps); % 计算DCT得到MFCC系数 mfcc = dct(log_mel); % 保留前13个系数（含能量系数） mfcc = mfcc(1:13, :); % 可选：添加能量特征 energy = log(sum(power_frames) + eps); mfcc(1, :) = energy; % 替换第0阶MFCC为对数能量 % 应用倒谱均值归一化 (CMN) mfcc = mfcc - mean(mfcc, 2); end function [f0, time] = extract_pitch(audio, fs) % 使用改进的自相关方法 frame_size = round(0.05 * fs); hop_size = round(0.025 * fs); n_frames = floor((length(audio) - frame_size) / hop_size) + 1; f0 = zeros(1, n_frames); time = (0:n_frames-1)hop_size/fs + frame_size/(2fs); % 预处理：带通滤波和预加重 [b, a] = butter(4, [80, 2000]/(fs/2), 'bandpass'); audio = filtfilt(b, a, audio); audio = filter([1, -0.97], 1, audio); % 预加重 for i = 1:n_frames start_idx = (i-1)hop_size + 1; frame = audio(start_idx:start_idx+frame_size-1); % 归一化自相关函数 autocorr = xcorr(frame, 'normalized'); autocorr = autocorr(frame_size:end); % 取非负延迟部分 % 寻找第一个显著峰值 [peaks, locs] = findpeaks(autocorr, 'MinPeakHeight', 0.3); if ~isempty(locs) % 找到最低频率的显著峰值 valid_locs = locs(peaks > 0.5max(peaks)); if ~isempty(valid_locs) tau = valid_locs(1); else [~, tau] = max(autocorr); end else [~, tau] = max(autocorr); end % 二次插值 if tau > 1 && tau < length(autocorr)-1 ac_vals = 
autocorr(tau-1:tau+1); delta = (ac_vals(1) - ac_vals(3)) / (2(2ac_vals(2) - ac_vals(1) - ac_vals(3))); tau = tau + delta; end % 计算基频 f0(i) = fs / tau; end % 后处理：改进的平滑和插值 valid = f0 > 80 & f0 < 1000; f0(~valid) = NaN; f0 = fillmissing(f0, 'movmedian', 10); f0 = fillmissing(f0, 'pchip'); % 谐波增强：验证基频和谐波一致性 for i = 1:length(f0) if ~isnan(f0(i)) % 检查第二谐波是否存在 harmonic_freq = 2f0(i); harmonic_bin = round(harmonic_freq frame_size / fs); if harmonic_bin <= frame_size/2 frame_start = (i-1)hop_size + 1; frame = audio(frame_start:frame_start+frame_size-1); spectrum = abs(fft(frame)); harmonic_strength = spectrum(harmonic_bin+1); fundamental_strength = spectrum(round(f0(i)frame_size/fs)+1); % 如果谐波强度不足，降低置信度 if harmonic_strength < 0.5fundamental_strength f0(i) = NaN; end end end end % 最终插值 f0 = fillmissing(f0, 'pchip'); end function [corrected, f0_corrected] = correct_pitch(fig, audio, fs, f0_src, f0_ref, time_src, time_ref) % 创建进度条 h = uiprogressdlg(fig, 'Title', '处理中', 'Message', '音高校正...'); % 动态计算最优段长（基于音高变化率） f0_variation = mean(abs(diff(f0_src(f0_src > 0)))); segment_duration = max(0.1, min(0.5, 0.3/(f0_variation/50 + 0.1))); % 自适应段长 segment_samples = round(segment_duration fs); n_segments = ceil(length(audio) / segment_samples); corrected = zeros(size(audio)); f0_corrected = zeros(size(f0_src)); % 创建参考音高插值函数（使用形状保持插值） valid_ref = f0_ref > 0; if any(valid_ref) ref_interp = @(t) interp1(time_ref(valid_ref), f0_ref(valid_ref), t, 'pchip', 'extrap'); else ref_interp = @(t) 0; end % 创建音高变化强度因子（基于音高差异） pitch_diff = abs(f0_src - ref_interp(time_src)); pitch_diff(pitch_diff < 20) = 0; % 忽略微小差异 intensity_factor = min(2, 1 + pitch_diff/100); % 1-2倍强度因子 for seg = 1:n_segments h.Value = seg/n_segments; h.Message = sprintf('处理段 %d/%d (%.1f%%)', seg, n_segments, seg/n_segments100); % 获取当前段 start_idx = max(1, (seg-1)segment_samples + 1); end_idx = min(length(audio), segsegment_samples); segment_audio = audio(start_idx:end_idx); % 计算段内平均音高（加权平均） seg_time = time_src(time_src >= 
(start_idx-1)/fs & time_src <= end_idx/fs); valid_seg = f0_src >= start_idx/fs & f0_src <= end_idx/fs & f0_src > 0; if any(valid_seg) % 计算加权平均（差异大的部分权重更高） weights = intensity_factor(valid_seg); mean_src = sum(f0_src(valid_seg).weights) / sum(weights); mean_ref = sum(ref_interp(seg_time).weights) / sum(weights); ratio = mean_ref / mean_src; else ratio = 1; end % 应用强度因子增强变化 seg_intensity = mean(intensity_factor(valid_seg)); ratio = ratio^seg_intensity; % 指数增强 % 限制比例范围（更严格的限制） ratio = max(0.8, min(1.25, ratio)); % 应用增强的相位声码器 try corrected_seg = enhanced_phase_vocoder(segment_audio, ratio, fs); catch ME corrected_seg = segment_audio; end % 存储结果 seg_end = min(length(corrected), start_idx + length(corrected_seg) - 1); corrected(start_idx:seg_end) = corrected_seg(1:min(length(corrected_seg), seg_end-start_idx+1)); % 增强的交叉淡入淡出处理（余弦渐变） if seg > 1 fade_samples = round(0.03 fs); % 30ms淡入淡出 prev_end = (seg-1)segment_samples; fade_range = max(1, prev_end-fade_samples+1):prev_end; if fade_range(end) <= length(corrected) && fade_range(1) > 0 fade_in = (1 - cos(linspace(0, pi, fade_samples)))/2; fade_out = (1 + cos(linspace(0, pi, fade_samples)))/2; % 确保长度匹配 if length(fade_in) > length(fade_range) fade_in = fade_in(1:length(fade_range)); fade_out = fade_out(1:length(fade_range)); end % 应用交叉混合 corrected(fade_range) = corrected(fade_range).fade_out(:) + ... 
corrected_seg(1:length(fade_range)).fade_in(:); end end end % 重新提取矫正后的音高 [f0_corrected, time_corr] = extract_pitch(corrected, fs); % 后处理：应用音高导向的平滑滤波器 f0_diff = abs(f0_corrected - ref_interp(time_corr)); smooth_window = max(3, min(15, round(f0_diff/5))); % 根据差异调整平滑窗口 f0_corrected = movmedian(f0_corrected, smooth_window); % 确保数据格式正确 corrected = real(corrected); corrected = corrected / max(abs(corrected)); % 归一化 close(h); end function y = enhanced_phase_vocoder(x, ratio, fs) % 自适应帧长（高频用较短帧，低频用较长帧） avg_pitch = mean(pitch(x, fs)); % 需要pitch函数 frame_size = round(min(4096, max(1024, 2048 (200/avg_pitch)))); overlap = round(frame_size * 0.75); hop_in = frame_size - overlap; hop_out = round(hop_in * ratio); % 使用改进的STFT处理（汉宁窗） win = hann(frame_size, 'periodic'); [S, f, t] = stft(x, fs, 'Window', win, 'OverlapLength', overlap, 'FFTLength', frame_size); % 相位处理 Y = enhanced_phase_processing(S, hop_in, hop_out, fs); % 重建信号（使用加权重叠相加法） y = istft(Y, fs, 'Window', win, 'OverlapLength', frame_size - hop_out, ... 
'FFTLength', frame_size, 'Method', 'wola'); % 长度匹配 if length(y) > length(x) y = y(1:length(x)); elseif length(y) < length(x) y = [y; zeros(length(x)-length(y), 1)]; end % 后处理：谱平滑减少人工痕迹 y = spectral_smoothing(y, fs, ratio); end function y = spectral_smoothing(x, fs, ratio) % 应用低通滤波减少高频人工痕迹 cutoff = min(8000, 20000 / ratio^0.5); % 自适应截止频率 [b, a] = butter(4, cutoff/(fs/2), 'low'); y = filtfilt(b, a, x); end function Y = enhanced_phase_processing(X, hop_in, hop_out, fs) Y = zeros(size(X)); if isempty(X), return; end n_bins = size(X, 1); freq_bins = (0:n_bins-1)' * fs / (2(n_bins-1)); bin_phase_inc = 2pi * freq_bins * hop_in / fs; phase_prev = angle(X(:,1)); Y(:,1) = abs(X(:,1)) .* exp(1jphase_prev); for k = 2:size(X,2) mag = abs(X(:,k)); phase = angle(X(:,k)); % 计算相位增量（考虑瞬时频率） delta_phase = phase - phase_prev - bin_phase_inc; % 相位展开（改进方法） delta_phase = delta_phase - 2piround(delta_phase/(2pi)); % 计算真实瞬时频率 inst_freq = bin_phase_inc + delta_phase; % 相位累积（考虑时间伸缩） adjusted_phase = phase_prev + inst_freq * hop_out / hop_in; % 相位一致性调整 if k > 2 phase_diff = adjusted_phase - angle(Y(:,k-1)); phase_diff = phase_diff - 2piround(phase_diff/(2pi)); adjusted_phase = angle(Y(:,k-1)) + phase_diff; end % 合成新帧 Y(:,k) = mag . 
exp(1jadjusted_phase); % 更新前一帧相位 phase_prev = adjusted_phase; end end function plot_pitch_comparison(time_src, f0_src, time_ref, f0_ref, f0_corrected, src_wave, ref_wave, corr_wave, fs) % 确保所有序列长度一致 min_length = min([length(time_src), length(time_ref), length(f0_corrected)]); time_src = time_src(1:min_length); f0_src = f0_src(1:min_length); time_ref = time_ref(1:min_length); f0_ref = f0_ref(1:min_length); f0_corrected = f0_corrected(1:min_length); % 创建综合音高对比图（包含波形和音高） pitch_fig = figure('Name', '音频波形与音高分析', 'Position', [100 100 900 800]); % 原始音频波形 + 音高 subplot(3,1,1); time_wave_src = (1:length(src_wave)) / fs; yyaxis left; plot(time_wave_src, src_wave, 'Color', [0.7 0.7 1], 'LineWidth', 0.5); ylabel('幅度'); ylim([-1.1 1.1]); % 固定幅度范围 yyaxis right; plot(time_src, f0_src, 'b', 'LineWidth', 1.5); hold on; plot(time_ref, f0_ref, 'r--', 'LineWidth', 1.5); hold off; title('原始音频波形与音高'); xlabel('时间 (s)'); ylabel('频率 (Hz)'); legend('原始波形', '原始音高', '参考音高', 'Location', 'best'); grid on; % 参考音频波形 + 音高 subplot(3,1,2); time_wave_ref = (1:length(ref_wave)) / fs; yyaxis left; plot(time_wave_ref, ref_wave, 'Color', [1 0.7 0.7], 'LineWidth', 0.5); ylabel('幅度'); ylim([-1.1 1.1]); % 固定幅度范围 yyaxis right; plot(time_ref, f0_ref, 'r', 'LineWidth', 1.5); title('参考音频波形与音高'); xlabel('时间 (s)'); ylabel('频率 (Hz)'); legend('参考波形', '参考音高', 'Location', 'best'); grid on; % 矫正后音频波形 + 音高 subplot(3,1,3); time_wave_corr = (1:length(corr_wave)) / fs; yyaxis left; plot(time_wave_corr, corr_wave, 'Color', [0.7 1 0.7], 'LineWidth', 0.5); ylabel('幅度'); ylim([-1.1 1.1]); % 固定幅度范围 yyaxis right; plot(time_src, f0_src, 'b:', 'LineWidth', 1); hold on; plot(time_ref, f0_ref, 'r--', 'LineWidth', 1); plot(time_src, f0_corrected, 'g', 'LineWidth', 2); hold off; title('矫正后音频波形与音高'); xlabel('时间 (s)'); ylabel('频率 (Hz)'); legend('矫正波形', '原始音高', '参考音高', '矫正音高', 'Location', 'best'); grid on; % 添加音高误差分析 valid_idx = (f0_src > 0) & (f0_ref > 0) & (f0_corrected > 0); if any(valid_idx) src_error = mean(abs(f0_src(valid_idx) - 
f0_ref(valid_idx))); corr_error = mean(abs(f0_corrected(valid_idx) - f0_ref(valid_idx))); annotation(pitch_fig, 'textbox', [0.15 0.05 0.7 0.05], ... 'String', sprintf('原始音高平均误差: %.2f Hz | 矫正后音高平均误差: %.2f Hz | 改进: %.1f%%', ... src_error, corr_error, (src_error - corr_error)/src_error100), ... 'FitBoxToText', 'on', 'BackgroundColor', [0.9 0.9 0.9], ... 'FontSize', 12, 'HorizontalAlignment', 'center'); end end function play_audio(fig, audio_type) if ~isvalid(fig) errordlg('主窗口无效!', '播放错误'); return; end switch audio_type case 'source' audio = fig.UserData.source_audio; title_text = '播放原始音频'; if isempty(audio) errordlg('未找到原始音频数据!', '播放错误'); return; end case 'corrected' audio = fig.UserData.corrected_audio; title_text = '播放矫正音频'; if isempty(audio) errordlg('请先完成音高校正!', '播放错误'); return; end otherwise return; end fs = fig.UserData.fs; player = audioplayer(audio, fs); % 创建播放控制界面 play_fig = uifigure('Name', title_text, 'Position', [500 500 300 150]); % 播放进度条 ax = uiaxes(play_fig, 'Position', [50 100 200 20]); hold(ax, 'on'); prog_line = plot(ax, [0 0], [0 1], 'b', 'LineWidth', 2); % 垂直范围[0,1] hold(ax, 'off'); xlim(ax, [0 1]); ylim(ax, [0 1]); set(ax, 'XTick', [], 'YTick', []); % 播放时间显示 time_label = uilabel(play_fig, 'Position', [50 80 200 20], ... 'Text', '00:00 / 00:00', 'HorizontalAlignment', 'center'); % 控制按钮 uibutton(play_fig, 'Position', [50 30 60 30], 'Text', '播放', ... 'ButtonPushedFcn', @(btn,event) play(player)); uibutton(play_fig, 'Position', [120 30 60 30], 'Text', '暂停', ... 'ButtonPushedFcn', @(btn,event) pause(player)); uibutton(play_fig, 'Position', [190 30 60 30], 'Text', '停止', ... 
    'ButtonPushedFcn', @(btn,event) stop(player));
% Compute the total duration
total_time = length(audio)/fs;
mins = floor(total_time/60);
secs = round(total_time - mins*60);
total_str = sprintf('%02d:%02d', mins, secs);
% Playback progress update callback
player.TimerFcn = {@update_playback, play_fig, time_label, total_str, prog_line, length(audio)};
player.TimerPeriod = 0.1; % update period (seconds)
player.StopFcn = @(src,event) stop_playback(src, event, play_fig);
end
function stop_playback(src, ~, fig)
stop(src);
if isvalid(fig)
    close(fig);
end
end
function save_audio(fig)
if ~isvalid(fig) || isempty(fig.UserData.corrected_audio)
    errordlg('无有效音频数据可保存!', '保存错误');
    return;
end
[file, path] = uiputfile('*.wav', '保存矫正音频');
if isequal(file, 0), return; end
audiowrite(fullfile(path, file), fig.UserData.corrected_audio, fig.UserData.fs);
msgbox('音频保存成功!', '完成');
end

The correction fails, and the error says that Y should be real.

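We also need to make sure that, in the `enhanced_phase_vocoder` function, the spectral-smoothing post-processing step receives a real signal. The spectral-smoothing (low-pass filtering) function `spectral_smoothing` uses `filtfilt` internally, which requires real input, so we take the real part before applying the spectral...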
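One plausible reason a resynthesized signal turns complex is that the modified spectrum is no longer conjugate-symmetric. That is an assumption here; the conversation does not confirm the cause. The Python sketch below illustrates it with a naive inverse DFT. The helpers `idft`, `hermitian_full` and `max_imag` are hypothetical names for this illustration and do not correspond to MATLAB functions.

```python
import cmath
import math


def idft(spec):
    """Naive O(N^2) inverse DFT; fine for a tiny illustration."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def hermitian_full(half, n):
    """Mirror a one-sided spectrum (bins 0..n//2) into a conjugate-symmetric n-point one.

    The DC bin (and the Nyquist bin for even n) must be real for a real result.
    """
    full = list(half) + [0j] * (n - len(half))
    for k in range(1, (n + 1) // 2):
        full[n - k] = half[k].conjugate()
    return full


def max_imag(signal):
    """Largest absolute imaginary part in a complex sequence."""
    return max(abs(v.imag) for v in signal)


half = [1 + 0j, 2 - 1j, 0.5j, 0j, 1 + 0j]     # one-sided spectrum for n = 8
full = hermitian_full(half, 8)
symmetric_signal = idft(full)                  # imaginary parts are rounding noise

broken = list(full)
broken[1] = full[1] * cmath.exp(0.7j)          # phase edit without mirroring bin 7
complex_signal = idft(broken)                  # clearly complex-valued

real_signal = [v.real for v in complex_signal]  # real input for a real-only filter
```

Taking the real part, as the answer above proposes, makes the downstream filter accept the signal. Rebuilding the spectrum with its mirrored half avoids producing the imaginary part in the first place.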

MATLAB development: zoomFFT

MATLAB development: zoomFFT. This program computes the zoom FFT of a time history.

zoomfft: learning MATLAB

ZoomFFT principles and a MATLAB implementation

频谱细化ZOOMFFT.m

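Spectrum refinement implemented in MATLAB using the ZOOMFFT method. The experimental results show that the refined spectrum has a higher frequency resolution.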
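To make "higher frequency resolution" concrete, the standard relation for a complex-modulation zoom FFT is sketched below. The symbols are assumptions of this note rather than taken from the listing: $f_s$ is the sampling rate, $N$ the FFT length, and $D$ the decimation (zoom) factor.

```latex
\Delta f_{\text{FFT}} = \frac{f_s}{N},
\qquad
\Delta f_{\text{zoom}} = \frac{f_s / D}{N} = \frac{f_s}{D\,N}
```

For example, with $f_s = 1000$ Hz, $N = 200$ and $D = 8$, the bin spacing drops from 5 Hz to 0.625 Hz. The finer spacing is not free: the $N$ decimated samples span $D\,N$ original samples, so the record must be $D$ times longer.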

MATLAB simulation of the band-selective fast Fourier transform (ZOOM-FFT), with a GUI and a recording of the simulation run

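1. Version: MATLAB 2021a. Includes a recording of the simulation run, playable with Windows Media Player.
2. Field: ZOOM-FFT
3. Content: ZOOM-FFT is known as the refined fast Fourier transform, also called the band-selective fast Fourier transform. It locally refines and magnifies the frequency content of a signal so that the band of interest is resolved with a higher frequency resolution.
% Step 1: multiply by exp
zoom_fft_xx = (x_real_zoom+j*x_imag_zoom).*exp(-j*2*pi*(0:N-1)*frequency_shift/Fs);
% Step 2: digital low-pass + resampling
zoom_fft_xx = zoom_fft_xx.*w;
% Step 3: FFT
zoom_fft_xx = fft(zoom_fft_xx);
% Step 4: frequency adjustment
zoomfft_x = zoomfft_x+(abs(zoom_fft_xx(1:N/2))/N*2).^2;
4. Notes: the Current Folder shown on the left of the MATLAB window must be the folder that contains the program. See the video recording for details.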
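The four steps listed above can be sketched end to end in plain Python. This is a minimal illustration under stated assumptions, not the packaged MATLAB program. The function names `dft` and `zoom_fft` are hypothetical, and the block average used in step 2 is a deliberately crude low-pass standing in for a properly designed filter. The naive DFT is used only to keep the example free of dependencies.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) DFT; adequate for a short decimated sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def zoom_fft(x, fs, f_center, decim):
    """Complex-modulation zoom FFT around f_center.

    Returns (freqs, mags) in FFT bin order; freqs are absolute frequencies in Hz.
    """
    # Step 1: complex modulation moves f_center down to 0 Hz.
    shifted = [x[n] * cmath.exp(-2j * math.pi * f_center * n / fs)
               for n in range(len(x))]
    # Step 2: crude low-pass (block average) combined with decimation by decim.
    m = len(shifted) // decim
    decimated = [sum(shifted[i * decim:(i + 1) * decim]) / decim for i in range(m)]
    # Step 3: transform the short decimated sequence.
    spec = dft(decimated)
    # Step 4: map each bin back to an absolute frequency.
    fs_dec = fs / decim
    freqs, mags = [], []
    for k in range(m):
        signed_k = k if k < m // 2 else k - m   # bins above m/2 are negative offsets
        freqs.append(f_center + signed_k * fs_dec / m)
        mags.append(abs(spec[k]) / m)
    return freqs, mags


# Usage: a 102.5 Hz tone sampled at 1 kHz, zoomed around 100 Hz with decim = 8.
fs = 1000.0
tone = [math.cos(2 * math.pi * 102.5 * n / fs) for n in range(1600)]
freqs, mags = zoom_fft(tone, fs, 100.0, 8)
peak_freq = freqs[max(range(len(mags)), key=mags.__getitem__)]
```

With these settings the 200 output bins are spaced 0.625 Hz apart around 100 Hz, whereas a 200-point transform of the raw signal would be spaced 5 Hz apart. The block average lets some out-of-band energy alias into the zoomed band, so a real implementation would use a sharper anti-aliasing filter.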

Algorithm research and simulation analysis based on ZoomFFT

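Algorithm research and simulation analysis based on ZoomFFT, by Qin Yuanqian and Tang Yi. After introducing the principle of the complex-modulation ZoomFFT method, this paper proposes a ZoomFFT-based fundamental-wave detection algorithm and implements it in simulation with MATLAB, demonstrating the effectiveness of the algorithm...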

[C/C++ thread safety]_[Intermediate]_[How to cancel and stop a thread]

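Scenario: when developing multithreaded programs, we often need to cancel (stop) a worker thread because a task has to be run again. C++11 currently has no good mechanism for cancelling a running thread, so how should thread cancellation be implemented? Notes: C++11 uses the <future> library to pass data between threads, and it can also be used to tell a thread to stop. It is not as powerful as pthread, though. pthread has cancellation-point functions: when a thread reaches one, it checks whether its state has been set to cancelled; if so, the pre-registered cleanup functions are called to release resources, and the code after the cancellation point is not executed. C++11 has no notion of cancellation points. The only option is for the worker thread itself to check some value, such as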
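The flag-checking approach described above is cooperative cancellation. The article concerns C++11, so the sketch below is only a Python analogue using the standard `threading.Event`. The function name `run_and_cancel` is hypothetical.

```python
import threading


def run_and_cancel():
    """Start a worker, request cancellation, and report whether it is still alive."""
    stop_requested = threading.Event()
    iterations = [0]

    def worker():
        # The flag check acts as a hand-written cancellation point.
        while not stop_requested.is_set():
            iterations[0] += 1
            stop_requested.wait(0.001)   # stand-in for one unit of real work

    thread = threading.Thread(target=worker)
    thread.start()
    stop_requested.set()       # ask the worker to stop
    thread.join(timeout=2.0)   # wait for it to leave its loop
    return thread.is_alive()
```

The worker only stops at the points where it checks the flag, so any cleanup can run as ordinary code after the loop. A worker that never checks the flag cannot be cancelled this way.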

项目过程与项目管理过程.pptx

The ZOOM-FFT algorithm and its application to frequency refinement

ZOOM-FFT.rar_ZOOM-FFT C_zoom fft_zoom-FFT_zoom-fft c_频谱

zoomFFT.txt

zoomFFT_zoomfft_zoomfft_

SFFT_光相位调制器_LCOS_光相位调制_S-FFT_菲涅尔衍射_

STR-IMPL.rar_STRAIGHT

exam1_sin.rar_功率 密度_功率谱_功率谱密度_正弦信号

dft.rar_DFT_DFT FORTRAN_傅立叶反变换_功率谱计算_谐波激励法

通过各种调制的功率谱不同来识别.rar_matlab例程_matlab_

16QAM.rar_matlab例程_matlab_

22MultipathChannel.rar_multipath channel_多径_多径仿真_多径信道_多径信道 matla

Latest resources

exam1_sin.rar_功率密度_功率谱_功率谱密度_正弦信号