The ZOOM-FFT Algorithm and Its Application to Frequency Refinement

RAR file

11 KB | Updated 2024-12-16
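ZOOM-FFT is a signal processing technique for frequency analysis. It refines the result of an ordinary FFT (fast Fourier transform) so that one chosen frequency band can be examined at high resolution. It suits cases where only part of the spectrum is of interest and the band can be analyzed at a reduced sample rate, which saves memory and computation and helps meet real-time constraints.

FFT compensation is a family of corrections for the two classic FFT problems, spectral leakage and aliasing. The two have different causes. Aliasing appears when the sampling rate is below twice the highest frequency present in the signal, so the Nyquist sampling theorem is violated. Leakage appears when the analyzed record does not contain a whole number of periods of a component, so its energy spreads into neighboring bins. Compensation adjusts the FFT output, for example by lengthening the record or by interpolating between bins in the frequency domain, to recover more accurate frequency, amplitude, and phase values. Interpolation corrects leakage errors only; aliasing has to be prevented before sampling or decimation by low-pass filtering.

A spectral method is a mathematical technique for analyzing the frequency content of a signal, used in signal processing, wave theory, quantum physics, and related fields. It studies a signal by decomposing it into its frequency components. In ZOOM-FFT, the spectral method is what identifies and analyzes the components inside the selected band.

Phase compensation corrects phase errors in FFT analysis. When the record is discontinuous at its boundaries, typically because a component does not complete a whole number of periods, the computed phase spectrum deviates from the true phase. Phase compensation removes this deviation and yields more accurate phase information.

Phase spectrum compensation is phase compensation applied to the full spectrum. Its goal is to correct the phase errors in the FFT output so that the result reflects the phase of the original signal. Phase accuracy matters most in time and frequency synchronization analysis.

Frequency refinement is the core function of ZOOM-FFT: it gives a high-resolution view of one frequency region of the signal. The refinement is obtained by narrowing the total bandwidth that is analyzed while increasing the number of spectral points that fall inside that band.

The central idea of ZOOM-FFT is to select the band of interest and magnify it while the original sampling frequency of the input stays unchanged. The algorithm applies frequency shifting or interpolation to the selected band to raise the resolution inside it without enlarging the FFT itself. If the shifted signal is decimated by a factor D and transformed with an N-point FFT, the bin spacing becomes Δf = fs / (D·N) instead of fs / N. The finer resolution is not free: the record must be D times longer than the one an N-point FFT at the original rate would need. The saving is in FFT size and memory, which makes ZOOM-FFT a resource-efficient, targeted analysis tool.

In practice, frequency shifting is the usual way to implement the refinement. Using the shift theorem, the signal is multiplied by a complex exponential so that the center of the band of interest moves to 0 Hz. The shifted signal is then typically low-pass filtered and decimated, and a standard FFT analyzes the result. This gives a high-resolution spectrum of the band without a significant increase in computation.

In summary, ZOOM-FFT refines a selected band of an FFT analysis to give a locally high-resolution spectrum. This is valuable in communications, acoustics, biomedical engineering, and other fields that need a close look at part of a signal. Combined with compensation and spectral methods, ZOOM-FFT helps mitigate common problems such as spectral leakage, aliasing, and phase distortion, and gives engineers and technicians a strong tool for analyzing complex signals.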
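The frequency-shift approach described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in the archive: the function names (`zoom_fft`, `lowpass_taps`, `dft`), the Hamming-windowed sinc filter, the 129-tap length, and the 0.4/D cutoff are all choices made here for the example. It uses only the Python standard library and a direct DFT, which is adequate for the short decimated record.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) DFT; adequate for the short, decimated record used here."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc FIR low-pass; cutoff is in cycles per sample."""
    mid = (num_taps - 1) / 2
    taps = []
    for i in range(num_taps):
        t = i - mid
        ideal = 2 * cutoff if t == 0 else math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [h / gain for h in taps]  # unity gain at 0 Hz


def zoom_fft(x, fs, f_center, decim, n_out, num_taps=129):
    """Return (freqs, mags) for a band centered on f_center.

    Bin spacing is fs / (decim * n_out); bins near the edges of the
    returned span fall in the filter transition band and are unreliable.
    """
    needed = (n_out - 1) * decim + num_taps
    if len(x) < needed:
        raise ValueError("record too short: need %d samples" % needed)
    # 1. Shift f_center down to 0 Hz (shift theorem).
    shifted = [x[k] * cmath.exp(-2j * math.pi * f_center * k / fs)
               for k in range(needed)]
    # 2. Low-pass filter and decimate: only every decim-th output is computed.
    taps = lowpass_taps(num_taps, 0.4 / decim)
    baseband = [sum(h * shifted[m * decim + j] for j, h in enumerate(taps))
                for m in range(n_out)]
    # 3. Ordinary DFT of the short baseband record.
    spectrum = dft(baseband)
    df = fs / (decim * n_out)
    half = n_out // 2
    order = list(range(half, n_out)) + list(range(half))  # negative offsets first
    freqs = [f_center + (m - n_out if m >= half else m) * df for m in order]
    mags = [abs(spectrum[m]) / n_out for m in order]
    return freqs, mags


# Example: a 1003 Hz tone sampled at 8 kHz, zoomed around 1000 Hz.
fs = 8000.0
decim, n_out = 16, 256
n = (n_out - 1) * decim + 129
x = [math.sin(2 * math.pi * 1003.0 * k / fs) for k in range(n)]
freqs, mags = zoom_fft(x, fs, 1000.0, decim, n_out)
peak = freqs[mags.index(max(mags))]
print("bin spacing %.3f Hz, peak near %.2f Hz" % (fs / (decim * n_out), peak))
```

With these example parameters the bin spacing is fs / (D·N) = 8000 / (16 × 256) ≈ 1.95 Hz, whereas a plain 256-point FFT at 8 kHz has 31.25 Hz bins. The price is the longer record: the example consumes 4209 input samples rather than 256.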
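Frequency-domain interpolation and phase compensation can be illustrated with the two-bin ratio method for a rectangular window. This is one common compensation scheme, chosen here as an assumption; the description does not say which scheme the archive uses, and the helper names (`dft_bin`, `corrected_peak`) are hypothetical. For a tone that lies δ bins away from the peak bin k0, the ratio α of the larger neighbor magnitude to the peak magnitude gives δ ≈ α / (1 + α), and the peak-bin phase is off by about πδ(N − 1)/N.

```python
import cmath
import math


def dft_bin(x, k):
    """Single DFT bin of a real or complex sequence."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))


def corrected_peak(x, fs):
    """Estimate frequency (Hz) and phase (rad) of the strongest tone.

    Assumes a rectangular window and a single dominant sinusoid that is
    far from 0 Hz and from fs/2; the result is approximate otherwise.
    """
    n = len(x)
    half = n // 2
    spectrum = [dft_bin(x, k) for k in range(half)]
    k0 = max(range(1, half - 1), key=lambda k: abs(spectrum[k]))
    left = abs(spectrum[k0 - 1])
    right = abs(spectrum[k0 + 1])
    # Fractional bin offset from the ratio of the larger neighbor to the peak.
    if right >= left:
        alpha = right / abs(spectrum[k0])
        delta = alpha / (1 + alpha)
    else:
        alpha = left / abs(spectrum[k0])
        delta = -alpha / (1 + alpha)
    freq = (k0 + delta) * fs / n
    # Remove the leakage-induced phase offset, then wrap to (-pi, pi].
    phase = cmath.phase(spectrum[k0]) - math.pi * delta * (n - 1) / n
    phase = math.atan2(math.sin(phase), math.cos(phase))
    return freq, phase


# Example: a 40.3 Hz cosine with phase 0.7 rad, 256 samples at 256 Hz (1 Hz bins).
fs = 256.0
x = [math.cos(2 * math.pi * 40.3 * i / fs + 0.7) for i in range(256)]
freq, phase = corrected_peak(x, fs)
print("estimated frequency %.3f Hz, phase %.3f rad" % (freq, phase))
```

Without the correction, the strongest bin in this example sits at 40 Hz and its raw phase includes the leakage offset. With a different window such as Hann, the ratio formula changes, so the expression above should not be reused unchanged.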

相关推荐

filetype

function timbre_transfer_2 % 创建主界面 fig = figure('Name', '高级音色转换系统 v3.2', 'Position', [50, 50, 1200, 900], ... 'NumberTitle', 'off', 'MenuBar', 'none', 'Resize', 'on', ... 'CloseRequestFcn', @close_gui, 'Color', [0.94, 0.94, 0.94]); % 全局变量 fs = 44100; % 默认采样率 source_audio = []; % 源音频(提供音色) target_audio = []; % 目标音频(提供内容) converted_audio = []; % 转换后的音频 processing = false; % 处理状态标志 conversion_complete = false; % 转换完成标志 % STFT参数 stft_params.win_len = 2048; % 窗长 stft_params.overlap = 1536; % 重叠点数 (75%) stft_params.nfft = 2048; % FFT点数 stft_params.window = hamming(stft_params.win_len, 'periodic'); % 汉明窗 stft_params.lifter_order = 30; % 包络阶数 stft_params.phase_iter = 5; % 相位迭代次数 stft_params.fs = fs; % 采样率参数 stft_params.hop_size = stft_params.win_len - stft_params.overlap; % 跳跃长度 % 计算合成窗 (确保完美重建) stft_params.win_synthesis = stft_params.window / sum(stft_params.window.^2) * stft_params.hop_size; % === 创建控件 === % 顶部控制面板 control_panel = uipanel('Title', '音频控制', 'Position', [0.02, 0.92, 0.96, 0.07], ... 'BackgroundColor', [0.9, 0.95, 1]); uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', '导入源音频(音色)',... 'Position', [20, 10, 150, 30], 'Callback', @load_source, ... 'FontSize', 10, 'FontWeight', 'bold', 'BackgroundColor', [0.7, 0.9, 1]); uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', '导入目标音频(内容)',... 'Position', [190, 10, 150, 30], 'Callback', @load_target, ... 'FontSize', 10, 'FontWeight', 'bold', 'BackgroundColor', [0.7, 0.9, 1]); uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', '执行音色转换',... 'Position', [360, 10, 150, 30], 'Callback', @transfer_timbre, ... 'FontSize', 10, 'FontWeight', 'bold', 'BackgroundColor', [0.8, 1, 0.8]); uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', '播放目标音频',... 'Position', [530, 10, 120, 30], 'Callback', {@play_target_audio}, ... 'FontSize', 10, 'BackgroundColor', [1, 0.95, 0.8]); uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', '播放转换音频',... 
'Position', [670, 10, 120, 30], 'Callback', {@play_converted_audio}, ... 'FontSize', 10, 'BackgroundColor', [1, 0.95, 0.8]); uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', '保存转换音频',... 'Position', [810, 10, 120, 30], 'Callback', @save_audio, ... 'FontSize', 10, 'BackgroundColor', [0.9, 1, 0.8]); uicontrol('Parent', control_panel, 'Style', 'pushbutton', 'String', '停止播放',... 'Position', [950, 10, 120, 30], 'Callback', @stop_audio, ... 'FontSize', 10, 'BackgroundColor', [1, 0.8, 0.8]); % 参数控制面板 param_panel = uipanel('Title', 'STFT参数设置', 'Position', [0.02, 0.82, 0.96, 0.09], ... 'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold'); uicontrol('Parent', param_panel, 'Style', 'text', 'String', '窗长:',... 'Position', [20, 40, 50, 20], 'HorizontalAlignment', 'left',... 'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold'); win_len_edit = uicontrol('Parent', param_panel, 'Style', 'edit',... 'String', num2str(stft_params.win_len),... 'Position', [80, 40, 80, 25], 'Callback', @update_params, ... 'BackgroundColor', [1, 1, 1]); uicontrol('Parent', param_panel, 'Style', 'text', 'String', '重叠率(%):',... 'Position', [180, 40, 70, 20], 'HorizontalAlignment', 'left',... 'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold'); overlap_edit = uicontrol('Parent', param_panel, 'Style', 'edit',... 'String', '75',... 'Position', [260, 40, 80, 25], 'Callback', @update_params, ... 'BackgroundColor', [1, 1, 1]); uicontrol('Parent', param_panel, 'Style', 'text', 'String', 'FFT点数:',... 'Position', [360, 40, 60, 20], 'HorizontalAlignment', 'left',... 'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold'); nfft_edit = uicontrol('Parent', param_panel, 'Style', 'edit',... 'String', num2str(stft_params.nfft),... 'Position', [430, 40, 80, 25], 'Callback', @update_params, ... 'BackgroundColor', [1, 1, 1]); uicontrol('Parent', param_panel, 'Style', 'text', 'String', '包络阶数:',... 'Position', [530, 40, 60, 20], 'HorizontalAlignment', 'left',... 
'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold'); lifter_edit = uicontrol('Parent', param_panel, 'Style', 'edit',... 'String', num2str(stft_params.lifter_order),... 'Position', [600, 40, 80, 25], 'Callback', @update_params, ... 'BackgroundColor', [1, 1, 1]); uicontrol('Parent', param_panel, 'Style', 'text', 'String', '相位迭代:',... 'Position', [700, 40, 60, 20], 'HorizontalAlignment', 'left',... 'BackgroundColor', [0.95, 0.97, 1], 'FontWeight', 'bold'); iter_edit = uicontrol('Parent', param_panel, 'Style', 'edit',... 'String', num2str(stft_params.phase_iter),... 'Position', [770, 40, 80, 25], 'Callback', @update_params, ... 'BackgroundColor', [1, 1, 1]); % 波形显示区域 - 使用选项卡 tabgp = uitabgroup(fig, 'Position', [0.02, 0.02, 0.96, 0.35]); tab1 = uitab(tabgp, 'Title', '目标音频'); tab2 = uitab(tabgp, 'Title', '转换后音频'); tab3 = uitab(tabgp, 'Title', '源音频'); ax1 = axes('Parent', tab1, 'Position', [0.07, 0.15, 0.9, 0.75]); title(ax1, '目标音频波形'); xlabel(ax1, '时间 (s)'); ylabel(ax1, '幅度'); grid(ax1, 'on'); ax2 = axes('Parent', tab2, 'Position', [0.07, 0.15, 0.9, 0.75]); title(ax2, '转换后音频波形'); xlabel(ax2, '时间 (s)'); ylabel(ax2, '幅度'); grid(ax2, 'on'); ax3 = axes('Parent', tab3, 'Position', [0.07, 0.15, 0.9, 0.75]); title(ax3, '源音频波形'); xlabel(ax3, '时间 (s)'); ylabel(ax3, '幅度'); grid(ax3, 'on'); % 频谱显示区域(只保留三个频谱图) spec_panel = uipanel('Title', '频谱分析', 'Position', [0.02, 0.38, 0.96, 0.43], ... 'BackgroundColor', [0.98, 0.98, 0.98], 'FontWeight', 'bold'); % 增大频谱图尺寸(垂直方向) ax4 = axes('Parent', spec_panel, 'Position', [0.03, 0.1, 0.3, 0.8]); % 高度增加到80% title(ax4, '源音频频谱'); ax5 = axes('Parent', spec_panel, 'Position', [0.36, 0.1, 0.3, 0.8]); % 高度增加到80% title(ax5, '目标音频频谱'); ax6 = axes('Parent', spec_panel, 'Position', [0.69, 0.1, 0.3, 0.8]); % 高度增加到80% title(ax6, '转换后频谱'); % 状态文本 status_text = uicontrol('Style', 'text', 'Position', [50, 5, 900, 30],... 'String', '就绪', 'HorizontalAlignment', 'left',... 
'FontSize', 10, 'FontWeight', 'bold', 'BackgroundColor', [1, 1, 1]); % 进度条 progress_ax = axes('Position', [0.1, 0.97, 0.8, 0.02],... 'XLim', [0, 1], 'YLim', [0, 1], 'Box', 'on', 'Color', [0.9, 0.9, 0.9]); progress_bar = patch(progress_ax, [0 0 0 0], [0 0 1 1], [0.2, 0.6, 1]); axis(progress_ax, 'off'); progress_text = uicontrol('Style', 'text', 'Position', [500, 970, 200, 20],... 'String', '', 'HorizontalAlignment', 'center',... 'FontSize', 10, 'FontWeight', 'bold', 'BackgroundColor', [1, 1, 1]); % 诊断信息面板 diag_panel = uipanel('Title', '处理日志', 'Position', [0.02, 0.02, 0.96, 0.35], ... 'BackgroundColor', [0.95, 0.95, 0.95], 'Visible', 'off'); diag_text = uicontrol('Parent', diag_panel, 'Style', 'listbox', ... 'Position', [10, 10, 1140, 250], 'String', {'系统已初始化'}, ... 'HorizontalAlignment', 'left', 'FontSize', 9, ... 'BackgroundColor', [1, 1, 1], 'Max', 100, 'Min', 0); % 添加显示/隐藏日志按钮 uicontrol('Style', 'pushbutton', 'String', '显示日志',... 'Position', [1125, 928, 120, 30], 'Callback', @toggle_log, ... 
'FontSize', 9, 'BackgroundColor', [0.9, 0.95, 1]); % === 回调函数 === % 更新参数回调 function update_params(~, ~) try % 获取新参数值 new_win_len = str2double(get(win_len_edit, 'String')); overlap_percent = str2double(get(overlap_edit, 'String')); new_nfft = str2double(get(nfft_edit, 'String')); lifter_order = str2double(get(lifter_edit, 'String')); phase_iter = str2double(get(iter_edit, 'String')); % 验证参数 if isnan(new_win_len) || new_win_len <= 0 || mod(new_win_len, 1) ~= 0 error('窗长必须是正整数'); end if isnan(overlap_percent) || overlap_percent < 0 || overlap_percent > 100 error('重叠率必须是0-100之间的数字'); end if isnan(new_nfft) || new_nfft <= 0 || mod(new_nfft, 1) ~= 0 error('FFT点数必须是正整数'); end if isnan(lifter_order) || lifter_order <= 0 || mod(lifter_order, 1) ~= 0 error('包络阶数必须是正整数'); end if isnan(phase_iter) || phase_iter <= 0 || mod(phase_iter, 1) ~= 0 error('相位迭代次数必须是正整数'); end % 更新参数 stft_params.win_len = new_win_len; stft_params.overlap = round(overlap_percent/100 * new_win_len); stft_params.nfft = new_nfft; stft_params.window = hamming(new_win_len, 'periodic'); stft_params.lifter_order = lifter_order; stft_params.phase_iter = phase_iter; stft_params.hop_size = stft_params.win_len - stft_params.overlap; stft_params.win_synthesis = stft_params.window / sum(stft_params.window.^2) * stft_params.hop_size; update_diag(sprintf('参数更新: 窗长=%d, 重叠=%d(%.0f%%), FFT=%d', ... 
new_win_len, stft_params.overlap, overlap_percent, new_nfft)); catch e errordlg(['参数错误: ', e.message], '输入错误'); update_diag(['参数错误: ', e.message], true); end end % 更新诊断信息 function update_diag(msg, force) if nargin < 2, force = false; end if ~conversion_complete || force current = get(diag_text, 'String'); new_msg = sprintf('[%s] %s', datestr(now, 'HH:MM:SS'), msg); set(diag_text, 'String', [current; {new_msg}]); set(diag_text, 'Value', length(get(diag_text, 'String'))); end end % 切换日志显示 function toggle_log(~, ~) if strcmp(get(diag_panel, 'Visible'), 'on') set(diag_panel, 'Visible', 'off'); set(tabgp, 'Position', [0.02, 0.02, 0.96, 0.35]); else set(diag_panel, 'Visible', 'on'); set(tabgp, 'Position', [0.02, 0.38, 0.96, 0.35]); end end % 关闭GUI回调 function close_gui(~, ~) if processing choice = questdlg('处理正在进行中,确定要关闭吗?', '确认关闭', '是', '否', '否'); if strcmp(choice, '否') return; end end stop_audio(); delete(fig); end % 导入源音频 function load_source(~, ~) if processing, return; end [file, path] = uigetfile({'*.wav;*.mp3;*.ogg', '音频文件 (*.wav,*.mp3,*.ogg)'}); if isequal(file, 0), return; end try [audio, fs_in] = audioread(fullfile(path, file)); update_diag(['加载源音频: ', file, ' (', num2str(fs_in), 'Hz)']); set(status_text, 'String', ['正在处理源音频: ', file]); drawnow; % 转换为单声道并归一化 if size(audio, 2) > 1 source_audio = mean(audio, 2); update_diag('转换为单声道'); else source_audio = audio; end source_audio = source_audio / max(abs(source_audio)); update_diag('归一化完成'); % 更新采样率参数 stft_params.fs = fs; % 采样率处理 if fs == 0 fs = fs_in; elseif fs ~= fs_in update_diag(['重采样: ', num2str(fs_in), 'Hz -> ', num2str(fs), 'Hz']); source_audio = resample(source_audio, fs, fs_in); end % 显示波形和频谱 plot(ax3, (0:length(source_audio)-1)/fs, source_audio); title(ax3, ['源音频波形: ', file]); xlabel(ax3, '时间 (s)'); ylabel(ax3, '幅度'); grid(ax3, 'on'); % 显示频谱 show_spectrum(ax4, source_audio, fs, stft_params, '源音频频谱'); set(status_text, 'String', ['已加载源音频: ', file, ' (', num2str(fs/1000), 'kHz)']); update_diag(['源音频长度: ', 
num2str(length(source_audio)/fs), '秒']); % 重置转换完成标志 conversion_complete = false; catch e errordlg(['加载源音频失败: ', e.message], '错误'); update_diag(['错误: ', e.message], true); end end % 导入目标音频 function load_target(~, ~) if processing, return; end [file, path] = uigetfile({'*.wav;*.mp3;*.ogg', '音频文件 (*.wav,*.mp3,*.ogg)'}); if isequal(file, 0), return; end try [audio, fs_in] = audioread(fullfile(path, file)); update_diag(['加载目标音频: ', file, ' (', num2str(fs_in), 'Hz)']); set(status_text, 'String', ['正在处理目标音频: ', file]); drawnow; % 转换为单声道并归一化 if size(audio, 2) > 1 target_audio = mean(audio, 2); update_diag('转换为单声道'); else target_audio = audio; end target_audio = target_audio / max(abs(target_audio)); update_diag('归一化完成'); % 更新采样率参数 stft_params.fs = fs; % 采样率处理 if fs == 0 fs = fs_in; elseif fs ~= fs_in update_diag(['重采样: ', num2str(fs_in), 'Hz -> ', num2str(fs), 'Hz']); target_audio = resample(target_audio, fs, fs_in); end % 显示波形和频谱 plot(ax1, (0:length(target_audio)-1)/fs, target_audio); title(ax1, ['目标音频波形: ', file]); xlabel(ax1, '时间 (s)'); ylabel(ax1, '幅度'); grid(ax1, 'on'); % 显示频谱 show_spectrum(ax5, target_audio, fs, stft_params, '目标音频频谱'); set(status_text, 'String', ['已加载目标音频: ', file, ' (', num2str(fs/1000), 'kHz)']); update_diag(['目标音频长度: ', num2str(length(target_audio)/fs), '秒']); % 重置转换完成标志 conversion_complete = false; catch e errordlg(['加载目标音频失败: ', e.message], '错误'); update_diag(['错误: ', e.message], true); end end %% === 在transfer_timbre末尾添加后处理 === function transfer_timbre(~, ~) if processing, return; end if isempty(source_audio) || isempty(target_audio) errordlg('请先导入源音频和目标音频!', '错误'); return; end % 设置处理状态 processing = true; conversion_complete = false; set(status_text, 'String', '开始音色转换...'); update_diag('=== 开始音色转换 ==='); drawnow; % 统一音频长度(以目标音频长度为基准) target_length = length(target_audio); source_length = length(source_audio); if source_length < target_length % 源音频较短,重复填充 num_repeat = ceil(target_length / source_length); extended_source = repmat(source_audio, 
num_repeat, 1); source_audio_adj = extended_source(1:target_length); update_diag('源音频已扩展以匹配目标长度'); elseif source_length > target_length % 源音频较长,截断 source_audio_adj = source_audio(1:target_length); update_diag('源音频已截断以匹配目标长度'); else source_audio_adj = source_audio; end % 确保长度兼容 target_audio_adj = target_audio(1:min(target_length, length(source_audio_adj))); source_audio_adj = source_audio_adj(1:min(target_length, length(source_audio_adj))); try % === 瞬态检测 === update_diag('检测瞬态区域...'); transients = detect_transients(target_audio_adj, stft_params.win_len, stft_params.hop_size); % === 目标音频STFT === update_diag('对目标音频进行STFT...'); update_progress(0.1, '目标音频STFT'); [mag_target, phase_target] = optimized_stft(target_audio_adj, stft_params, @update_progress); update_diag(sprintf('目标音频STFT完成: %d帧', size(mag_target,2))); % === 源音频STFT === update_diag('对源音频进行STFT...'); update_progress(0.3, '源音频STFT'); [mag_source] = optimized_stft(source_audio_adj, stft_params, @update_progress); update_diag(sprintf('源音频STFT完成: %d帧', size(mag_source,2))); % 确保频谱矩阵大小相同 if size(mag_target, 2) ~= size(mag_source, 2) min_frames = min(size(mag_target, 2), size(mag_source, 2)); mag_target = mag_target(:, 1:min_frames); mag_source = mag_source(:, 1:min_frames); phase_target = phase_target(:, 1:min_frames); update_diag(sprintf('调整频谱帧数: %d帧', min_frames)); end % === 改进的频谱转换算法 === update_diag('应用改进的音色转换算法...'); update_progress(0.65, '频谱转换'); % 1. 计算源音频的频谱包络 mag_source_env = spectral_envelope(mag_source, stft_params.lifter_order, stft_params.nfft); % 2. 计算目标音频的频谱包络 mag_target_env = spectral_envelope(mag_target, stft_params.lifter_order, stft_params.nfft); % 3. 计算源音频的频谱细节(改进方法) mag_source_detail = spectral_detail(mag_source, mag_source_env); % 4. 应用转换:目标包络 + 源细节 mag_new = mag_target_env .* mag_source_detail; % 5. 频谱整形(增强音色特征) mag_new = spectral_shaping(mag_new, mag_source_env, mag_target_env); % % 6. 
相位处理(直接使用目标相位) % phase_new = phase_target; % update_diag('使用目标音频相位'); % === 改进的相位处理 === if any(transients) update_diag('瞬态区域使用目标相位'); phase_new = phase_target; % 瞬态区域直接使用目标相位 else update_diag('非瞬态区域重建相位'); phase_new = phase_reconstruction(mag_new, phase_target, stft_params); end % === 重建音频 === update_diag('重建音频(ISTFT)...'); update_progress(0.90, 'ISTFT重建'); converted_audio = optimized_istft(mag_new, phase_new, stft_params, @update_progress); converted_audio = converted_audio / max(abs(converted_audio)); % 归一化 % === 添加后处理 === converted_audio = post_process(converted_audio, fs, source_audio, target_audio); % 确保长度匹配 if length(converted_audio) > target_length converted_audio = converted_audio(1:target_length); elseif length(converted_audio) < target_length converted_audio = [converted_audio; zeros(target_length - length(converted_audio), 1)]; end % 显示结果 plot(ax2, (0:length(converted_audio)-1)/fs, converted_audio); title(ax2, '转换后音频波形'); xlabel(ax2, '时间 (s)'); ylabel(ax2, '幅度'); grid(ax2, 'on'); % 显示转换后频谱 show_spectrum(ax6, converted_audio, fs, stft_params, '转换后频谱'); % 更新状态 update_progress(1.0, '转换完成'); set(status_text, 'String', '音色转换完成!'); update_diag('音色转换成功!', true); % 设置完成标志 conversion_complete = true; % 清理大内存变量 clear mag_target mag_source mag_new; catch e errordlg(['音色转换失败: ', e.message], '错误'); update_diag(['错误: ', e.message], true); set(progress_bar, 'FaceColor', [1, 0.3, 0.3]); set(progress_text, 'String', '处理失败'); end % 重置处理状态 processing = false; end %% === 后处理函数 === function y = post_process(x, fs, source, target) % 1. 瞬态增强 y = transient_enhancement(x, fs); % 2. 频谱均衡 if ~isempty(source) && ~isempty(target) y = spectral_eq(y, fs, source, target); end % 3. 动态范围控制 y = dynamic_range_control(y, fs); % 4. 
最终归一化 y = y / max(abs(y)); end %% === 瞬态增强函数 === function y = transient_enhancement(x, fs) % 瞬态检测和增强 envelope = abs(hilbert(x)); diff_env = diff(envelope); diff_env = [diff_env(1); diff_env]; threshold = 0.1 * max(abs(diff_env)); transients = abs(diff_env) > threshold; attack_time = 0.005; decay_time = 0.05; attack_samples = round(attack_time * fs); decay_samples = round(decay_time * fs); gain_vector = ones(size(x)); transient_starts = find(diff([0; transients]) == 1); for i = 1:length(transient_starts) start_idx = transient_starts(i); end_idx = min(start_idx + attack_samples + decay_samples, length(x)); attack_phase = linspace(1, 1.8, attack_samples)'; decay_phase = linspace(1.8, 1, decay_samples)'; full_phase = [attack_phase; decay_phase]; valid_length = min(length(full_phase), end_idx - start_idx); gain_vector(start_idx:start_idx+valid_length-1) = full_phase(1:valid_length); end y = x .* gain_vector; y = y / max(abs(y)) * 0.98; limiter_threshold = 0.95; y(y > limiter_threshold) = limiter_threshold; y(y < -limiter_threshold) = -limiter_threshold; end %% === 修正后的频谱均衡函数 === function y = spectral_eq(y, fs, source, target) % 基于源和目标频谱特性的均衡 (绕过graphicEQ) % 1. 频谱分析参数 window_size = 1024; overlap = 512; nfft = 1024; % 2. 计算源音频平均频谱 [~, F, ~, Pxx_source] = spectrogram(... source, hamming(window_size), overlap, nfft, fs, 'power'); Pxx_source = mean(Pxx_source, 2); % 3. 计算目标音频平均频谱 [~, ~, ~, Pxx_target] = spectrogram(... target, hamming(window_size), overlap, nfft, fs, 'power'); Pxx_target = mean(Pxx_target, 2); % 4. 计算期望增益 (幅度比) desired_gain = sqrt(Pxx_target ./ (Pxx_source + eps)); gains_db = 20*log10(desired_gain); % 转换为dB % 5. 设计均衡滤波器组 (绕过graphicEQ) eq_filter = design_robust_eq(F, gains_db, fs); % 6. 应用均衡 y = filter(eq_filter, y); end %% 鲁棒的均衡滤波器设计函数 function Hd = design_robust_eq(F, gains_db, fs) % 方法1:尝试使用designfilt的graphiceq选项 try Hd = designfilt('graphiceq', ... 'SampleRate', fs, ... 'Frequencies', F, ... 'Gains', gains_db, ... 
'Bandwidth', 1/3); % 1/3倍频程带宽 return; catch end % 方法2:手动创建级联滤波器组 try % 初始化滤波器集合 filters = {}; % 为每个频点设计峰值滤波器 for i = 1:length(F) % 跳过无效频点 if F(i) <= 0 || F(i) >= fs/2 continue; end % 设计峰值滤波器 [b, a] = peaking_filter(F(i), gains_db(i), fs); % 添加到滤波器集合 filters{end+1} = dfilt.df2(b, a); %#ok<AGROW> end % 级联所有滤波器 Hd = dfilt.cascade(filters{:}); return; catch end % 方法3:基础FIR均衡器 (最后防线) try % 创建目标频率响应 freq_vector = linspace(0, fs/2, 1024); gain_vector = interp1(F, gains_db, freq_vector, 'pchip', 'extrap'); mag_response = 10.^(gain_vector/20); % dB转幅度 % 设计FIR滤波器 Hd = designfilt('arbmagfir', ... 'FilterOrder', 256, ... 'Frequencies', freq_vector, ... 'Amplitudes', mag_response, ... 'SampleRate', fs); return; catch end % 所有方法均失败时返回直通滤波器 warning('所有均衡器设计方法均失败,返回直通滤波器'); Hd = dfilt.dffir(1); % 增益为1的FIR滤波器 end %% 峰值滤波器设计函数 function [b, a] = peaking_filter(fc, G, fs, Q) % 参数默认值 if nargin < 4 Q = 1.5; % 默认品质因数 end % 标准化频率 w0 = 2*pi*fc/fs; alpha = sin(w0)/(2*Q); % 增益线性转换 A = 10^(G/40); % 滤波器系数 b0 = 1 + alpha*A; b1 = -2*cos(w0); b2 = 1 - alpha*A; a0 = 1 + alpha/A; a1 = -2*cos(w0); a2 = 1 - alpha/A; % 归一化系数 b = [b0, b1, b2] / a0; a = [a0, a1, a2] / a0; end %% === 动态范围控制函数 === function y = dynamic_range_control(y, fs) compressor1 = compressor(... 'SampleRate', fs, ... 'Threshold', -20, ... 'Ratio', 2, ... 'KneeWidth', 6, ... 'AttackTime', 0.02, ... 'ReleaseTime', 0.1, ... 
'MakeUpGainMode', 'Auto'); y = compressor1(y); end %% === 瞬态检测函数 === function transients = detect_transients(audio, win_len, hop_size) % 基于能量变化的瞬态检测 num_frames = floor((length(audio)-win_len)/hop_size) + 1; transients = false(num_frames, 1); prev_energy = 0; threshold = 0.3; % 能量变化阈值 for i = 1:num_frames start_idx = (i-1)*hop_size + 1; end_idx = start_idx + win_len - 1; frame = audio(start_idx:end_idx); frame_energy = sum(frame.^2); if i > 1 energy_diff = (frame_energy - prev_energy) / prev_energy; if energy_diff > threshold transients(i) = true; end end prev_energy = frame_energy; end end %% === 新增相位重建函数 === function phase = phase_reconstruction(mag, init_phase, params) % Griffin-Lim相位重建算法 num_iter = params.phase_iter; [num_bins, ~] = size(mag); current_phase = init_phase; for iter = 1:num_iter % 1. 创建复数频谱 S_half = mag .* exp(1i*current_phase); % 2. 重建时域信号 x_recon = optimized_istft(mag, current_phase, params, []); % 3. 重新计算STFT [~, new_phase] = optimized_stft(x_recon, params, []); % 4. 更新相位 current_phase = new_phase; end phase = current_phase; end % 替换原来的 spectral_envelope 函数 %% === 改进的频谱包络提取函数 === function env = spectral_envelope(mag, lifter_order, nfft) % 使用倒谱分析提取更准确的包络 [num_bins, num_frames] = size(mag); env = zeros(size(mag)); for i = 1:num_frames % 1. 对数幅度谱 log_mag = log(mag(:, i) + eps); % 2. 计算倒谱 cepstrum = real(ifft(log_mag, nfft)); % 3. 提升:保留低频部分(包络) cepstrum(lifter_order+1:end-lifter_order+1) = 0; % 4. 重建包络 env_frame = real(fft(cepstrum, nfft)); env(:, i) = exp(env_frame(1:num_bins)); end % 5. 
时域平滑 env = movmean(env, 3, 2); end %% === 修改播放按钮回调 === function play_target_audio(~, ~) % 禁用播放按钮避免重复点击 set(handles.play_target_btn, 'Enable', 'off'); drawnow; try if isempty(target_audio) errordlg('目标音频为空!请先导入目标音频。', '播放错误'); return; end % 在后台线程中播放 parfeval(@() play_audio(target_audio, fs), 0); catch e errordlg(['播放失败: ' e.message], '播放错误'); end % 延迟后重新启用按钮 pause(1); % 防止立即重复点击 set(handles.play_target_btn, 'Enable', 'on'); end %% === 修改播放按钮回调 === function play_converted_audio(~, ~) % 禁用播放按钮避免重复点击 set(handles.play_target_btn, 'Enable', 'off'); drawnow; try if isempty(converted_audio) errordlg('目标音频为空!请先导入目标音频。', '播放错误'); return; end % 在后台线程中播放 parfeval(@() play_audio(converted_audio, fs), 0); catch e errordlg(['播放失败: ' e.message], '播放错误'); end % 延迟后重新启用按钮 pause(1); % 防止立即重复点击 set(handles.play_target_btn, 'Enable', 'on'); end % 进度更新函数 function update_progress(progress, message) if nargin >= 1 set(progress_bar, 'XData', [0, progress, progress, 0]); end if nargin >= 2 set(progress_text, 'String', message); set(status_text, 'String', message); end if nargin == 1 set(progress_text, 'String', sprintf('%.0f%%', progress*100)); end % 强制刷新界面 drawnow limitrate; end %% === 修改 play_audio 函数 === function play_audio(audio, fs) % 使用持久变量存储播放器状态 persistent player persistent is_playing % 初始化状态 if isempty(is_playing) is_playing = false; end % === 停止当前播放 === if is_playing && ~isempty(player) && isplaying(player) stop(player); is_playing = false; update_diag('停止当前播放'); end % === 增强音频验证 === if isempty(audio) || all(audio == 0) errordlg('无效的音频数据!', '播放错误'); update_diag('播放失败: 无效音频数据', true); return; end % 检查音频数据范围 if max(abs(audio)) > 1.5 audio = audio / max(abs(audio)); update_diag('音频已归一化', false); end % === 异步播放实现 === try % 确保使用列向量 if ~iscolumn(audio) audio = audio(:); end % 创建新播放器对象 player = audioplayer(audio, fs); % 设置回调函数 set(player, 'StartFcn', @play_start_callback); set(player, 'StopFcn', @play_stop_callback); set(player, 'TimerFcn', @play_progress_callback); set(player, 
'TimerPeriod', 0.1); % 100ms更新一次 % 异步播放 play(player); is_playing = true; catch e % 错误处理 errordlg(['播放失败: ' e.message], '播放错误'); update_diag(['播放错误: ' e.message], true); % 尝试系统命令播放 try_system_play(audio, fs); end end %% === 播放回调函数 === function play_start_callback(obj, ~) % 播放开始回调 duration = obj.TotalSamples / obj.SampleRate; status_msg = sprintf('开始播放 (%.1f秒)', duration); set(status_text, 'String', status_msg); update_diag(status_msg, false); end function play_stop_callback(~, ~) % 播放结束回调 set(status_text, 'String', '播放完成'); update_diag('播放完成', false); end function play_progress_callback(obj, ~) % 播放进度回调 if isplaying(obj) current = obj.CurrentSample; total = obj.TotalSamples; fs = obj.SampleRate; elapsed = current / fs; total_time = total / fs; progress = current / total; set(progress_bar, 'XData', [0, progress, progress, 0]); set(progress_text, 'String', sprintf('播放进度: %.0f%%', progress*100)); status = sprintf('播放中: %.1f/%.1f秒', elapsed, total_time); set(status_text, 'String', status); end end %% === 备用播放方案 === function try_system_play(audio, fs) % 当audioplayer失败时使用系统命令播放 try % 保存临时音频文件 temp_file = [tempname '.wav']; audiowrite(temp_file, audio, fs); % 跨平台播放命令 if ispc % Windows系统 system(['start "" "' temp_file '"']); elseif ismac % macOS系统 system(['afplay "' temp_file '" &']); else % Linux系统 system(['aplay "' temp_file '" &']); end update_diag(['使用系统命令播放: ' temp_file], true); set(status_text, 'String', '使用外部播放器播放音频'); catch e errordlg(['备用播放失败: ' e.message], '播放错误'); update_diag(['备用播放失败: ' e.message], true); end end function stop_audio(~, ~) try % 获取当前所有播放器对象 allPlayers = findall(0, 'Type', 'audioplayer'); % 停止并删除所有播放器 for i = 1:length(allPlayers) if isplaying(allPlayers(i)) stop(allPlayers(i)); end delete(allPlayers(i)); end set(status_text, 'String', '播放已停止'); update_diag('音频播放已停止', true); catch e errordlg(['停止播放失败: ', e.message], '错误'); update_diag(['停止播放错误: ', e.message], true); end end % 保存音频函数 function save_audio(~, ~) if processing errordlg('处理正在进行中,请稍后保存', 
'错误'); return; end if isempty(converted_audio) errordlg('没有转换后的音频可保存!', '错误'); return; end [file, path] = uiputfile('*.wav', '保存转换音频'); if isequal(file, 0), return; end set(status_text, 'String', '正在保存音频...'); update_diag(['开始保存: ', file], true); try % 直接保存音频 filename = fullfile(path, file); audiowrite(filename, converted_audio, fs); set(status_text, 'String', ['已保存: ', file]); update_diag(['音频已保存: ', filename], true); catch e errordlg(['保存失败: ', e.message], '极错误'); update_diag(['保存错误: ', e.message], true); end end function show_spectrum(ax, audio, fs, params, title_str) try % 检查输入音频 if isempty(audio) || length(audio) < params.win_len error('无效音频数据: 长度=%d, 需要≥%d', length(audio), params.win_len); end % 计算STFT [~, ~, f, t] = optimized_stft(audio, params, []); % 直接使用optimized_stft的维度验证 [mag, ~, f, t] = optimized_stft(audio, params, []); spec_data = 10*log10(abs(mag) + eps); % 绘图 cla(ax); imagesc(ax, t, f, spec_data); % 坐标轴设置 set(ax, 'YDir', 'normal'); axis(ax, 'tight'); ylim(ax, [0, fs/2]); % 限制到Nyquist频率 title(ax, [title_str, sprintf(' (%.1f秒)', length(audio)/fs)]); xlabel(ax, '时间 (s)'); ylabel(ax, '频率 (Hz)'); colorbar(ax); colormap(ax, 'jet'); catch e % 错误处理 cla(ax); err_msg = sprintf('频谱错误: %s\n音频尺寸: %dx%d', e.message, size(audio,1), size(audio,2)); text(ax, 0.5, 0.5, err_msg, ... 'HorizontalAlignment', 'center', ... 'Color', 'red', ... 'FontSize', 9); title(ax, [title_str, ' (错误)']); end end end %% === 单帧相位重建函数 === function phase_frame = frame_phase_reconstruction(mag_frame, init_phase, params) % 基于MAGNA方法的快速单帧相位重建 num_iter = 3; % 减少迭代次数 nfft = params.nfft; phase_frame = init_phase; % 创建初始复数频谱 S_half = mag_frame .* exp(1i*phase_frame); % 创建完整频谱 S_full = zeros(nfft, 1); S_full(1:length(mag_frame)) = S_half; S_full(end-length(mag_frame)+2:end) = conj(S_half(end-1:-1:2)); for iter = 1:num_iter % 1. 逆FFT frame = real(ifft(S_full)); % 2. 正向FFT S_full_new = fft(frame, nfft); % 3. 保持幅度,更新相位 S_full_new = mag_frame .* exp(1i*angle(S_full_new(1:length(mag_frame)))); % 4. 
重建完整频谱 S_full(1:length(mag_frame)) = S_full_new; S_full(end-length(mag_frame)+2:end) = conj(S_full_new(end-1:-1:2)); end phase_frame = angle(S_full_new); end %% === 改进的频谱细节函数 === function detail = spectral_detail(mag, env) % 更自然的细节提取 alpha = 0.5; % 细节增强因子 beta = 0.1; % 噪声抑制因子 % 基础细节计算 base_detail = mag ./ (env + eps); % 应用压缩函数避免极端值 detail = tanh(alpha * base_detail) / tanh(alpha); % 频域平滑 detail = imgaussfilt(detail, 1.5); end %% === 改进的频谱整形 === function mag_out = spectral_shaping(mag, env_source, env_target) % 更保守的频谱整形 balance_factor = 0.5; % 降低源音色特征强度 % 计算频谱比例因子(带限) freq_range = 1:min(100, size(mag,1)); % 只影响低频区域 ratio = ones(size(mag)); ratio(freq_range, :) = (env_source(freq_range, :) ./ env_target(freq_range, :)).^balance_factor; % 限制比例范围 ratio = min(max(ratio, 0.8), 1.2); % 应用比例因子 mag_out = mag .* ratio; end %% === 核心音频处理函数 === function [mag, phase, f, t] = optimized_stft(x, params, progress_callback) % 参数提取 win_len = params.win_len; hop_size = params.hop_size; nfft = params.nfft; fs = params.fs; % 输入验证 if isempty(x) || length(x) < win_len error('无效输入: 信号长度(%d) < 窗长(%d)', length(x), win_len); end % 创建窗函数 win = hann(win_len, 'periodic'); % 计算帧数 num_frames = floor((length(x) - win_len) / hop_size) + 1; % 初始化STFT矩阵 stft_matrix = zeros(nfft, num_frames); % === 关键修复: 正确的时间向量计算 === % 每帧的中心时间点 (秒) t = ((0:num_frames-1) * hop_size + win_len/2) / fs; % 进行STFT for i = 1:num_frames start_idx = (i-1) * hop_size + 1; end_idx = min(start_idx + win_len - 1, length(x)); segment = x(start_idx:end_idx); % 零填充短于窗长的段 if length(segment) < win_len segment = [segment; zeros(win_len - length(segment), 1)]; end segment = segment .* win; X = fft(segment, nfft); stft_matrix(:, i) = X; % 进度更新 if ~isempty(progress_callback) progress = i / num_frames; progress_callback(progress); end end % 取单边频谱 num_freq_bins = floor(nfft/2) + 1; stft_matrix = stft_matrix(1:num_freq_bins, :); % 计算幅度和相位 mag = abs(stft_matrix); phase = angle(stft_matrix); % 频率向量 (Hz) f = (0:num_freq_bins-1)' * (fs / nfft); % 
=== 维度验证 === assert(size(mag, 1) == length(f), ... '频率维度不匹配: mag行数=%d, f长度=%d', size(mag,1), length(f)); assert(size(mag, 2) == length(t), ... '时间维度不匹配: mag列数=%d, t长度=%d', size(mag,2), length(t)); end function x_recon = optimized_istft(mag, phase, params, progress_callback) % 优化的逆短时傅里叶变换(ISTFT)实现 % 输入: % mag - 幅度谱 (单边谱) % phase - 相位谱 (单边谱) % params - 参数结构体 % progress_callback - 进度回调函数 % 输出: % x_recon - 重建的时域信号 % === 输入验证增强 === if isempty(mag) || isempty(phase) error('ISTFT输入为空'); end % === 参数提取 === nfft = params.nfft; win_len = params.win_len; hop_size = win_len - params.overlap; win_synth = params.win_synthesis; [num_bins, num_frames] = size(mag); % 计算信号总长度 total_samples = (num_frames - 1) * hop_size + win_len; x_recon = zeros(total_samples, 1); % 进度更新间隔 update_interval = max(1, floor(num_frames/10)); % === 重建复数频谱 === S_half = mag .* exp(1i * phase); % === 创建双边谱 === S_full = zeros(nfft, num_frames); if rem(nfft, 2) % 奇数点FFT S_full(1:num_bins, :) = S_half; S_full(num_bins+1:end, :) = conj(S_half(end:-1:2, :)); else % 偶数点FFT S_full(1:num_bins, :) = S_half; % 注意:Nyquist点处理 S_full(num_bins+1:end, :) = conj(S_half(end-1:-1:2, :)); end % === 执行逆FFT和重叠相加 === for frame_idx = 1:num_frames % 1. 逆FFT frame = real(ifft(S_full(:, frame_idx), nfft)); % 2. 应用合成窗 frame_win = frame(1:win_len) .* win_synth; % 3. 计算位置并叠加 start_idx = (frame_idx - 1) * hop_size + 1; end_idx = start_idx + win_len - 1; % 确保不越界 if end_idx > total_samples end_idx = total_samples; frame_win = frame_win(1:(end_idx - start_idx + 1)); end % 重叠相加 x_recon(start_idx:end_idx) = x_recon(start_idx:end_idx) + frame_win; % 4. 进度更新 if ~isempty(progress_callback) && mod(frame_idx, update_interval) == 0 progress_callback(frame_idx/num_frames * 0.2, ... 
sprintf('ISTFT重建: %d/%d', frame_idx, num_frames)); end end % === 归一化处理 === % 计算重叠因子 overlap_factor = win_len / hop_size; % 计算归一化窗口 norm_win = zeros(total_samples, 1); for i = 1:num_frames start_idx = (i - 1) * hop_size + 1; end_idx = min(start_idx + win_len - 1, total_samples); norm_win(start_idx:end_idx) = norm_win(start_idx:end_idx) + win_synth(1:(end_idx-start_idx+1)).^2; end % 避免除以零 norm_win(norm_win < eps) = eps; % 应用归一化 x_recon = x_recon ./ norm_win; end 在这个代码中,运行时总是卡在ISTFT中,每次卡的进度值不一样,请修改


    % Simplified spectrogram display function (no envelope computation)
    function show_spectrum(ax, audio, fs, params, title_str)
        % Pre-declared so the catch block can always report sizes
        spec_data = []; f = []; t = [];
        try
            % Compute the STFT; optimized_stft returns [mag, phase, f, t]
            [mag, ~, f, t] = optimized_stft(audio, params, []);

            % Magnitude in dB
            spec_data = 20*log10(mag + eps);

            % === Orientation of the axis vectors ===
            f = f(:);     % frequency as a column vector
            t = t(:)';    % time as a row vector

            % === Dimension consistency check ===
            if size(spec_data, 1) ~= length(f) || size(spec_data, 2) ~= length(t)
                min_bins = min(size(spec_data, 1), length(f));
                min_frames = min(size(spec_data, 2), length(t));
                spec_data = spec_data(1:min_bins, 1:min_frames);
                f = f(1:min_bins);
                t = t(1:min_frames);
                update_diag(sprintf('Dimensions adjusted: spec_data(%d×%d), f(%d), t(%d)', ...
                    size(spec_data,1), size(spec_data,2), length(f), length(t)));
            end

            % === Coordinate check: clamp frequencies to the Nyquist limit ===
            nyquist = fs/2;
            f(f > nyquist) = nyquist;

            % Clear old content
            cla(ax);

            % === Draw the spectrogram ===
            imagesc(ax, t, f, spec_data);
            set(ax, 'YDir', 'normal');   % low frequencies at the bottom
            axis(ax, 'tight');

            % === Labels and colors ===
            title(ax, title_str);
            xlabel(ax, 'Time (s)');
            ylabel(ax, 'Frequency (Hz)');
            colorbar(ax);
            colormap(ax, 'jet');

            % Frequency range
            max_freq = min(fs/2, max(f));
            ylim(ax, [min(f), max_freq]);

            % Diagnostic message
            update_diag(sprintf('Spectrogram displayed: %s (size: %d×%d)', ...
                title_str, size(spec_data,1), size(spec_data,2)));
        catch e
            % Detailed error information
            dim_info = sprintf('Dimensions: spec_data(%d×%d), f(%d), t(%d)', ...
                size(spec_data,1), size(spec_data,2), length(f), length(t));
            err_msg = sprintf('Spectrogram error: %s\n%s', e.message, dim_info);

            % Show the error in the axes
            cla(ax);
            text(ax, 0.5, 0.5, err_msg, ...
                'HorizontalAlignment', 'center', ...
                'FontSize', 8, 'Color', 'red');
            title(ax, [title_str, ' (error)']);

            % Record the error in the diagnostics
            update_diag(['Spectrogram display error: ' err_msg], true);
        end
    end
end

%% === Core audio processing functions ===
function [mag, phase, f, t] = optimized_stft(x, params, progress_callback)
% Short-time Fourier transform (STFT).
% Inputs:
%   x                 - time-domain signal
%   params            - parameter struct (window length, overlap, FFT length, window, sample rate)
%   progress_callback - progress callback (may be empty)
% Outputs:
%   mag   - magnitude spectrum (one-sided)
%   phase - phase spectrum (one-sided)
%   f     - frequency vector (Hz)
%   t     - time vector (s)

% === Parameters ===
x = x(:);                        % work on a column vector
win_len = params.win_len;
overlap = params.overlap;
nfft = params.nfft;
win_anal = params.window(:);
fs = params.fs;
hop_size = win_len - overlap;

% === Number of frames ===
num_samples = length(x);
num_frames = max(0, floor((num_samples - overlap) / hop_size));

% === Allocate the full two-sided STFT matrix ===
S = zeros(nfft, num_frames);

% Progress update interval
update_interval = max(1, floor(num_frames/10));

% === Frame loop ===
for frame_idx = 1:num_frames
    % 1. Start and end index of the current frame
    start_idx = (frame_idx - 1) * hop_size + 1;
    end_idx = start_idx + win_len - 1;

    % Boundary handling: zero-pad a final frame that runs past the signal
    if end_idx > num_samples
        frame = [x(start_idx:end); zeros(end_idx - num_samples, 1)];
    else
        frame = x(start_idx:end_idx);
    end

    % 2. Window, 3. FFT
    S(:, frame_idx) = fft(frame .* win_anal, nfft);

    % 4. Progress update
    if ~isempty(progress_callback) && mod(frame_idx, update_interval) == 0
        progress_callback(frame_idx/num_frames * 0.2, ...
            sprintf('STFT computation: %d/%d', frame_idx, num_frames));
    end
end

% === One-sided spectrum ===
if rem(nfft, 2)   % odd FFT length
    num_bins = (nfft+1)/2;
else
    num_bins = nfft/2 + 1;
end
S_half = S(1:num_bins, :);

% Magnitude and phase
mag = abs(S_half);
phase = angle(S_half);

% Frequency vector
f = (0:num_bins-1) * fs / nfft;

% Time vector
t = (0:num_frames-1) * hop_size / fs;
end

function x_recon = optimized_istft(mag, phase, params, progress_callback)
% Inverse short-time Fourier transform (ISTFT) by weighted overlap-add.
% Inputs:
%   mag               - magnitude spectrum (one-sided)
%   phase             - phase spectrum (one-sided)
%   params            - parameter struct
%   progress_callback - progress callback (may be empty)
% Output:
%   x_recon           - reconstructed time-domain signal

% === Parameters ===
nfft = params.nfft;
win_len = params.win_len;
hop_size = win_len - params.overlap;
win_anal = params.window(:);
win_synth = params.win_synthesis(:);
[num_bins, num_frames] = size(mag);

% Total output length
total_samples = (num_frames - 1) * hop_size + win_len;
x_recon = zeros(total_samples, 1);
norm_win = zeros(total_samples, 1);

% Progress update interval
update_interval = max(1, floor(num_frames/10));

% === Rebuild the complex one-sided spectrum ===
S_half = mag .* exp(1i * phase);

% === Build the two-sided spectrum by conjugate symmetry ===
S_full = zeros(nfft, num_frames);
S_full(1:num_bins, :) = S_half;
if rem(nfft, 2)   % odd FFT length: there is no Nyquist bin
    S_full(num_bins+1:end, :) = conj(S_half(end:-1:2, :));
else              % even FFT length: skip the Nyquist bin when mirroring
    S_full(num_bins+1:end, :) = conj(S_half(end-1:-1:2, :));
end

% === Inverse FFT and overlap-add ===
for frame_idx = 1:num_frames
    % 1. Inverse FFT
    frame = real(ifft(S_full(:, frame_idx), nfft));
    % 2. Apply the synthesis window and overlap-add
    idx = (frame_idx - 1) * hop_size + (1:win_len)';
    x_recon(idx) = x_recon(idx) + frame(1:win_len) .* win_synth;
    % 3. Accumulate the normalization term in the same pass:
    %    analysis window times synthesis window (not the synthesis window squared)
    norm_win(idx) = norm_win(idx) + win_anal .* win_synth;
    % 4. Progress update
    if ~isempty(progress_callback) && mod(frame_idx, update_interval) == 0
        progress_callback(frame_idx/num_frames * 0.2, ...
            sprintf('ISTFT reconstruction: %d/%d', frame_idx, num_frames));
    end
end

% === Normalization ===
norm_win(norm_win < eps) = eps;   % avoid division by zero
x_recon = x_recon ./ norm_win;
end

You misunderstood me just now. This is my original code. Earlier in the conversation I asked you to rewrite the complete code, but the answer was cut off at the display function. Based on the code I provided, please fix it and give the complete code for everything after the display function, including every part.
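The framing arithmetic used by the STFT above can be worked through by hand. The sketch below is an illustration only: it plugs in the parameter values from the parameter block at the top of this page (44.1 kHz, 2048-sample window, 1536-sample overlap) and assumes a one-second input, which is a made-up length. It shows the sizes that the magnitude matrix and the `f` and `t` vectors should have, which is what a spectrogram display needs to agree on.

```matlab
% STFT framing arithmetic for a hypothetical one-second signal.
fs = 44100; win_len = 2048; overlap = 1536; nfft = 2048;

hop = win_len - overlap;                            % 512 samples between frames
num_samples = fs;                                   % assumed input length: 1 s
num_frames = floor((num_samples - overlap) / hop);  % 83 frames
num_bins = floor(nfft/2) + 1;                       % 1025 one-sided bins

f = (0:num_bins-1) * fs / nfft;                     % bin spacing fs/nfft, about 21.5 Hz
t = (0:num_frames-1) * hop / fs;                    % frame start times in seconds

% Last sample touched by the final frame; it must not exceed the signal length
last_end = (num_frames - 1) * hop + win_len;

fprintf('spectrogram size: %d x %d, top frequency %.0f Hz\n', ...
    num_bins, num_frames, f(end));
```

A magnitude matrix with any other shape than `num_bins` by `num_frames` points to swapped or missing output arguments rather than to a plotting problem.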


function audio_pitch_correction
% Main GUI window
fig = uifigure('Name', 'Audio Pitch Correction System', 'Position', [100 100 900 700]);

% Source selection area
uilabel(fig, 'Position', [50 680 300 20], 'Text', 'Audio to correct:', 'FontWeight', 'bold');

% Button group with the import and record options
source_btn_group = uibuttongroup(fig, 'Position', [50 630 300 40], 'Title', '');
uibutton(source_btn_group, 'Position', [10 10 130 30], 'Text', 'Import audio file', ...
    'ButtonPushedFcn', @(btn,event) select_audio(fig, 'source'));
uibutton(source_btn_group, 'Position', [160 10 130 30], 'Text', 'Record audio', ...
    'ButtonPushedFcn', @(btn,event) record_audio(fig));

% Reference audio selection
uilabel(fig, 'Position', [400 680 300 20], 'Text', 'Reference audio:', 'FontWeight', 'bold');
uibutton(fig, 'Position', [400 630 150 30], 'Text', 'Import reference audio', ...
    'ButtonPushedFcn', @(btn,event) select_audio(fig, 'reference'));

% Process button
process_btn = uibutton(fig, 'Position', [600 630 150 30], ...
    'Text', 'Start correction', 'Enable', 'off', ...
    'ButtonPushedFcn', @(btn,event) process_audio(fig));

% Play and save buttons
uibutton(fig, 'Position', [50 580 150 30], 'Text', 'Play original audio', ...
    'ButtonPushedFcn', @(btn,event) play_audio(fig, 'source'));
uibutton(fig, 'Position', [250 580 150 30], 'Text', 'Play corrected audio', ...
    'ButtonPushedFcn', @(btn,event) play_audio(fig, 'corrected'));
uibutton(fig, 'Position', [450 580 150 30], 'Text', 'Save corrected audio', ...
    'ButtonPushedFcn', @(btn,event) save_audio(fig));

% Recording status label
recording_label = uilabel(fig, 'Position', [650 580 200 30], ...
    'Text', 'Ready to record', 'FontColor', [0 0.5 0]);

% Waveform axes
ax_source = uiaxes(fig, 'Position', [50 350 800 150]);
title(ax_source, 'Waveform of audio to correct');
ax_reference = uiaxes(fig, 'Position', [50 180 800 150]);
title(ax_reference, 'Reference audio waveform');
ax_corrected = uiaxes(fig, 'Position', [50 10 800 150]);
title(ax_corrected, 'Corrected audio waveform');

% Shared state
fig.UserData.source_audio = [];
fig.UserData.reference_audio = [];
fig.UserData.corrected_audio = [];
fig.UserData.fs = 44100;   % default sample rate
fig.UserData.process_btn = process_btn;
fig.UserData.axes = struct('source', ax_source, 'reference', ax_reference, 'corrected', ax_corrected);
fig.UserData.recording_label = recording_label;
fig.UserData.recorder = [];   % audiorecorder object
fig.UserData.timer = [];      % timer object
end

function select_audio(fig, audio_type)
[file, path] = uigetfile({'*.wav;*.mp3;*.ogg;*.flac', ...
    'Audio files (*.wav,*.mp3,*.ogg,*.flac)'});
if isequal(file, 0)
    return;
end
filename = fullfile(path, file);
[audio, fs] = audioread(filename);

% Stereo: convert to mono
if size(audio, 2) > 1
    audio = mean(audio, 2);
end

% Keep the first 20 seconds
max_samples = min(20*fs, length(audio));
audio = audio(1:max_samples);

% Store the data
fig.UserData.([audio_type '_audio']) = audio;
fig.UserData.fs = fs;

% Update the waveform plot
ax = fig.UserData.axes.(audio_type);
plot(ax, (1:length(audio))/fs, audio);
xlabel(ax, 'Time (s)');
ylabel(ax, 'Amplitude');

% Enable the process button once both inputs are present
if ~isempty(fig.UserData.source_audio) && ~isempty(fig.UserData.reference_audio)
    fig.UserData.process_btn.Enable = 'on';
end
end

function record_audio(fig)
% Recording window
record_fig = uifigure('Name', 'Audio Recording', 'Position', [300 300 400 200]);

% Recording duration
uilabel(record_fig, 'Position', [50 150 100 20], 'Text', 'Duration (s):');
duration_edit = uieditfield(record_fig, 'numeric', ...
    'Position', [160 150 100 20], 'Value', 5, 'Limits', [1 30]);

% Sample rate
uilabel(record_fig, 'Position', [50 120 100 20], 'Text', 'Sample rate:');
fs_dropdown = uidropdown(record_fig, ...
    'Position', [160 120 100 20], ...
    'Items', {'8000', '16000', '44100', '48000'}, ...
    'Value', '44100');

% Control buttons
uibutton(record_fig, 'Position', [50 70 100 30], ...
    'Text', 'Start', ...
    'ButtonPushedFcn', @(btn,event) start_recording(fig, duration_edit.Value, str2double(fs_dropdown.Value)));
uibutton(record_fig, 'Position', [160 70 100 30], ...
    'Text', 'Stop', ...
    'ButtonPushedFcn', @(btn,event) stop_recording(fig));
uibutton(record_fig, 'Position', [270 70 100 30], ...
    'Text', 'Close', ...
    'ButtonPushedFcn', @(btn,event) close(record_fig));
end

function start_recording(fig, duration, fs)
% Update the status label
fig.UserData.recording_label.Text = 'Recording...';
fig.UserData.recording_label.FontColor = [1 0 0];
drawnow;

% Recorder object: 16-bit, mono
recorder = audiorecorder(fs, 16, 1);
fig.UserData.recorder = recorder;
fig.UserData.fs = fs;

% Start recording for the requested duration
record(recorder, duration);

% Timer that shows the remaining time (TasksToExecute must be an integer)
t = timer('ExecutionMode', 'fixedRate', 'Period', 1, ...
    'TasksToExecute', ceil(duration), ...
    'TimerFcn', @(t,~) update_recording_timer(fig, t, ceil(duration)));
start(t);
fig.UserData.timer = t;
end

function update_recording_timer(fig, t, total_duration)
elapsed = t.TasksExecuted;
remaining = total_duration - elapsed;
fig.UserData.recording_label.Text = sprintf('Recording: %d s', remaining);

% Stop automatically when the time is up
if remaining <= 0
    stop_recording(fig);
end
end

function stop_recording(fig)
recorder = fig.UserData.recorder;
if isempty(recorder)
    return;   % nothing has been recorded yet
end
if isrecording(recorder)
    stop(recorder);
end

% Stop the timer
if ~isempty(fig.UserData.timer) && isvalid(fig.UserData.timer)
    stop(fig.UserData.timer);
    delete(fig.UserData.timer);
    fig.UserData.timer = [];
end

% Fetch the recorded samples
audio = getaudiodata(recorder);
fs = fig.UserData.fs;

% Update the status label
fig.UserData.recording_label.Text = 'Recording finished!';
fig.UserData.recording_label.FontColor = [0 0.5 0];

% Store as the audio to correct
fig.UserData.source_audio = audio;

% Update the waveform plot
ax = fig.UserData.axes.source;
plot(ax, (1:length(audio))/fs, audio);
title(ax, 'Recorded audio waveform');
xlabel(ax, 'Time (s)');
ylabel(ax, 'Amplitude');

% Enable the process button
if ~isempty(fig.UserData.reference_audio)
    fig.UserData.process_btn.Enable = 'on';
end
end

function process_audio(fig)
% Make sure the main window still exists before touching its data
if ~isvalid(fig)
    errordlg('The main window is closed; cannot process audio.', 'Processing error');
    return;
end
source = fig.UserData.source_audio;
reference = fig.UserData.reference_audio;
fs = fig.UserData.fs;

% Progress dialog
h = uiprogressdlg(fig, 'Title', 'Processing', 'Message', 'Aligning audio...', 'Indeterminate', 'on');

% Step 1: alignment
try
    [aligned_source, aligned_ref] = improved_align_audio(source, reference, fs);
catch ME
    close(h);
    errordlg(['Audio alignment failed: ' ME.message], 'Processing error');
    return;
end

% Step 2: fundamental-frequency extraction
h.Message = 'Extracting pitch...';
try
    [f0_source, time_source] = extract_pitch(aligned_source, fs);
    [f0_ref, time_ref] = extract_pitch(aligned_ref, fs);
catch ME
    close(h);
    errordlg(['Pitch extraction failed: ' ME.message], 'Processing error');
    return;
end

% Step 3: pitch correction
h.Message = 'Correcting pitch...';
try
    [corrected, f0_corrected] = correct_pitch(fig, aligned_source, fs, f0_source, f0_ref, time_source, time_ref);
catch ME
    close(h);
    errordlg(['Pitch correction failed: ' ME.message], 'Processing error');
    return;
end

close(h);

% Store the corrected result
fig.UserData.corrected_audio = corrected;

% Enable the play button for the corrected audio
play_btn = findobj(fig, 'Text', 'Play corrected audio');
if ~isempty(play_btn)
    play_btn.Enable = 'on';
end

% Original audio: waveform plus pitch curve
ax_src = fig.UserData.axes.source;
cla(ax_src);
yyaxis(ax_src, 'left');
plot(ax_src, (1:length(aligned_source))/fs, aligned_source, 'b');
ylabel(ax_src, 'Amplitude');
yyaxis(ax_src, 'right');
plot(ax_src, time_source, f0_source, 'r', 'LineWidth', 1.5);
ylabel(ax_src, 'Frequency (Hz)');
title(ax_src, 'Original audio waveform and pitch');
grid(ax_src, 'on');

% Reference audio: waveform plus pitch curve
ax_ref = fig.UserData.axes.reference;
cla(ax_ref);
yyaxis(ax_ref, 'left');
plot(ax_ref, (1:length(aligned_ref))/fs, aligned_ref, 'g');
ylabel(ax_ref, 'Amplitude');
yyaxis(ax_ref, 'right');
plot(ax_ref, time_ref, f0_ref, 'm', 'LineWidth', 1.5);
ylabel(ax_ref, 'Frequency (Hz)');
title(ax_ref, 'Reference audio waveform and pitch');
grid(ax_ref, 'on');

% Corrected audio: waveform plus pitch curve
ax_corr = fig.UserData.axes.corrected;
cla(ax_corr);
yyaxis(ax_corr, 'left');
plot(ax_corr, (1:length(corrected))/fs, corrected, 'Color', [0.5 0 0.5]);
ylabel(ax_corr, 'Amplitude');
yyaxis(ax_corr, 'right');
plot(ax_corr, time_source, f0_corrected, 'Color', [1 0.5 0], 'LineWidth', 2);
ylabel(ax_corr, 'Frequency (Hz)');
title(ax_corr, 'Corrected audio waveform and pitch');
grid(ax_corr, 'on');

% Combined pitch comparison figure (waveforms are passed in as well)
plot_pitch_comparison(time_source, f0_source, time_ref, f0_ref, f0_corrected, ...
    aligned_source, aligned_ref, corrected, fs);

fprintf('Mean original pitch: %.1f Hz\n', mean(f0_source(f0_source>0)));
fprintf('Mean reference pitch: %.1f Hz\n', mean(f0_ref(f0_ref>0)));
fprintf('Mean corrected pitch: %.1f Hz\n', mean(f0_corrected(f0_corrected>0)));
end

function [aligned_src, aligned_ref] = improved_align_audio(src, ref, fs)
% Align two recordings by cross-correlating their spectral-energy envelopes.
% (Comparing frame i of one signal with frame i of the other, as before,
% always yields a zero time offset.)
win_size = round(0.1 * fs);    % 100 ms window
hop_size = round(0.05 * fs);   % 50 ms hop

% Spectrograms of both signals
S_src = spectrogram(src, win_size, win_size-hop_size, win_size, fs);
S_ref = spectrogram(ref, win_size, win_size-hop_size, win_size, fs);

% Frame-wise energy envelopes with the mean removed
env_src = sum(abs(S_src), 1); env_src = env_src - mean(env_src);
env_ref = sum(abs(S_ref), 1); env_ref = env_ref - mean(env_ref);

% The lag of the cross-correlation maximum is the delay of src relative to ref, in frames
[corr_vals, lags] = xcorr(env_src, env_ref);
[~, max_idx] = max(corr_vals);
sample_diff = lags(max_idx) * hop_size;

% Drop the leading samples of whichever signal starts late
if sample_diff > 0
    aligned_src = src(sample_diff+1:end);
    aligned_ref = ref;
else
    aligned_src = src;
    aligned_ref = ref(-sample_diff+1:end);
end

% Equal length
min_len = min(length(aligned_src), length(aligned_ref));
aligned_src = aligned_src(1:min_len);
aligned_ref = aligned_ref(1:min_len);
end

function mfcc = mfcc_feature(audio, fs, frame_size, hop_size)
% Argument check
if nargin < 4
    hop_size = round(frame_size/2);   % default: 50% overlap
end

% Pre-emphasis
audio = filter([1 -0.97], 1, audio);

% Framing
frames = buffer(audio, frame_size, frame_size - hop_size, 'nodelay');
num_frames = size(frames, 2);

% Hamming window
window = hamming(frame_size);
windowed_frames = frames .* repmat(window, 1, num_frames);

% Power spectrum
nfft = 2^nextpow2(frame_size);
mag_frames = abs(fft(windowed_frames, nfft));
power_frames = (mag_frames(1:nfft/2+1, :)).^2;

% Mel filter bank design
num_filters = 26;
mel_min = 0;
mel_max = 2595 * log10(1 + (fs/2)/700);

% Equally spaced points on the mel scale, converted back to Hz
mel_points = linspace(mel_min, mel_max, num_filters + 2);
hz_points = 700 * (10.^(mel_points/2595) - 1);

% FFT bin indices
bin_indices = floor((nfft+1) * hz_points / fs);

% Triangular filters
filter_bank = zeros(num_filters, nfft/2+1);
for m = 2:num_filters+1
    left = bin_indices(m-1);
    center = bin_indices(m);
    right = bin_indices(m+1);
    % Rising slope
    for k = left:center-1
        filter_bank(m-1, k+1) = (k - left) / (center - left);
    end
    % Falling slope
    for k = center:right
        filter_bank(m-1, k+1) = (right - k) / (right - center);
    end
end

% Apply the filter bank and take the log
mel_spectrum = filter_bank * power_frames;
log_mel = log(mel_spectrum + eps);

% DCT gives the MFCCs; keep the first 13 coefficients
mfcc = dct(log_mel);
mfcc = mfcc(1:13, :);

% Replace the 0th coefficient with the log energy
energy = log(sum(power_frames) + eps);
mfcc(1, :) = energy;

% Cepstral mean normalization (CMN)
mfcc = mfcc - mean(mfcc, 2);
end

function [f0, time] = extract_pitch(audio, fs)
% Autocorrelation-based pitch tracker
frame_size = round(0.05 * fs);
hop_size = round(0.025 * fs);
n_frames = floor((length(audio) - frame_size) / hop_size) + 1;
f0 = zeros(1, n_frames);
time = (0:n_frames-1)*hop_size/fs + frame_size/(2*fs);

% Pre-processing: band-pass filter and pre-emphasis
[b, a] = butter(4, [80, 2000]/(fs/2), 'bandpass');
audio = filtfilt(b, a, audio);
audio = filter([1, -0.97], 1, audio);

for i = 1:n_frames
    start_idx = (i-1)*hop_size + 1;
    frame = audio(start_idx:start_idx+frame_size-1);

    % Normalized autocorrelation; keep the non-negative lags (index 1 is lag 0)
    autocorr = xcorr(frame, 'normalized');
    autocorr = autocorr(frame_size:end);

    % First significant peak
    [peaks, locs] = findpeaks(autocorr, 'MinPeakHeight', 0.3);
    tau = [];
    if ~isempty(locs)
        valid_locs = locs(peaks > 0.5*max(peaks));
        if ~isempty(valid_locs)
            tau = valid_locs(1);
        end
    end
    if isempty(tau)
        continue;   % no clear periodicity: f0(i) stays 0 and is treated as missing below
    end

    % Parabolic interpolation around the peak
    if tau > 1 && tau < length(autocorr)
        ac_vals = autocorr(tau-1:tau+1);
        denom = ac_vals(1) - 2*ac_vals(2) + ac_vals(3);
        if denom ~= 0
            tau = tau + 0.5*(ac_vals(1) - ac_vals(3)) / denom;
        end
    end

    % Fundamental frequency: index tau corresponds to a lag of tau-1 samples
    f0(i) = fs / (tau - 1);
end

% Post-processing: discard implausible values, then smooth and interpolate
valid = f0 > 80 & f0 < 1000;
f0(~valid) = NaN;
f0 = fillmissing(f0, 'movmedian', 10);
f0 = fillmissing(f0, 'pchip');

% Harmonic check: the second harmonic should be present
for i = 1:length(f0)
    if ~isnan(f0(i))
        harmonic_freq = 2*f0(i);
        harmonic_bin = round(harmonic_freq * frame_size / fs);
        if harmonic_bin <= frame_size/2
            frame_start = (i-1)*hop_size + 1;
            frame = audio(frame_start:frame_start+frame_size-1);
            spectrum = abs(fft(frame));
            harmonic_strength = spectrum(harmonic_bin+1);
            fundamental_strength = spectrum(round(f0(i)*frame_size/fs)+1);
            % A weak harmonic lowers confidence: mark the frame as missing
            if harmonic_strength < 0.5*fundamental_strength
                f0(i) = NaN;
            end
        end
    end
end

% Final interpolation
f0 = fillmissing(f0, 'pchip');
end

function [corrected, f0_corrected] = correct_pitch(fig, audio, fs, f0_src, f0_ref, time_src, time_ref)
% Progress dialog
h = uiprogressdlg(fig, 'Title', 'Processing', 'Message', 'Correcting pitch...');

% Adaptive segment length, based on how fast the pitch changes
f0_variation = mean(abs(diff(f0_src(f0_src > 0))));
segment_duration = max(0.1, min(0.5, 0.3/(f0_variation/50 + 0.1)));
segment_samples = round(segment_duration * fs);
n_segments = ceil(length(audio) / segment_samples);
corrected = zeros(size(audio));

% Shape-preserving interpolation of the reference pitch (needs at least two points)
valid_ref = f0_ref > 0;
if nnz(valid_ref) >= 2
    ref_interp = @(t) interp1(time_ref(valid_ref), f0_ref(valid_ref), t, 'pchip', 'extrap');
else
    ref_interp = @(t) zeros(size(t));
end

% Intensity factor in [1, 2], driven by the pitch difference
pitch_diff = abs(f0_src - ref_interp(time_src));
pitch_diff(pitch_diff < 20) = 0;   % ignore small differences
intensity_factor = min(2, 1 + pitch_diff/100);

for seg = 1:n_segments
    h.Value = seg/n_segments;
    h.Message = sprintf('Segment %d/%d (%.1f%%)', seg, n_segments, seg/n_segments*100);

    % Current segment
    start_idx = (seg-1)*segment_samples + 1;
    end_idx = min(length(audio), seg*segment_samples);
    segment_audio = audio(start_idx:end_idx);

    % Voiced pitch frames whose time stamp falls inside this segment
    in_seg = time_src >= (start_idx-1)/fs & time_src <= end_idx/fs;
    valid_seg = in_seg & f0_src > 0;

    ratio = 1;
    if any(valid_seg)
        % Weighted means: frames with a larger difference weigh more
        weights = intensity_factor(valid_seg);
        mean_src = sum(f0_src(valid_seg).*weights) / sum(weights);
        mean_ref = sum(ref_interp(time_src(valid_seg)).*weights) / sum(weights);
        if mean_ref > 0
            % The exponent strengthens large corrections
            ratio = (mean_ref / mean_src)^mean(weights);
        end
    end

    % Limit the shift ratio
    ratio = max(0.8, min(1.25, ratio));

    % Phase vocoder; fall back to the unmodified segment on failure
    try
        corrected_seg = enhanced_phase_vocoder(segment_audio, ratio, fs);
    catch
        corrected_seg = segment_audio;
    end
    corrected_seg = corrected_seg(:);

    % Store the result
    seg_end = min(length(corrected), start_idx + length(corrected_seg) - 1);
    corrected(start_idx:seg_end) = corrected_seg(1:seg_end-start_idx+1);

    % Raised-cosine crossfade over the last 30 ms of the previous segment
    if seg > 1
        n_fade = min(round(0.03 * fs), length(corrected_seg));
        prev_end = (seg-1)*segment_samples;
        fade_range = (prev_end-n_fade+1:prev_end)';
        fade_in = (1 - cos(linspace(0, pi, n_fade)'))/2;
        fade_out = 1 - fade_in;
        corrected(fade_range) = corrected(fade_range).*fade_out + ...
            corrected_seg(1:n_fade).*fade_in;
    end
end

% Force a real signal before re-extracting the pitch: findpeaks inside
% extract_pitch rejects complex input, a likely source of the "Y must be real" error
corrected = real(corrected);

% Re-extract the pitch of the corrected audio
[f0_corrected, time_corr] = extract_pitch(corrected, fs);

% Median smoothing; the window length must be a scalar, so use the median difference
f0_diff = abs(f0_corrected - ref_interp(time_corr));
smooth_window = max(3, min(15, round(median(f0_diff, 'omitnan')/5)));
f0_corrected = movmedian(f0_corrected, smooth_window);

% Peak normalization
peak = max(abs(corrected));
if peak > 0
    corrected = corrected / peak;
end
close(h);
end

function y = enhanced_phase_vocoder(x, ratio, fs)
% Pitch shift by `ratio`: time-stretch with a phase vocoder, then resample
% Adaptive frame length: shorter frames for high pitch, longer for low pitch
avg_pitch = mean(pitch(x, fs));   % pitch() requires Audio Toolbox
frame_size = round(min(4096, max(1024, 2048 * (200/avg_pitch))));
frame_size = 2*floor(frame_size/2);   % even length, so the top bin sits at fs/2
overlap = round(frame_size * 0.75);
hop_in = frame_size - overlap;
hop_out = round(hop_in * ratio);

% One-sided STFT with a Hann window: the phase processing below assumes
% bins running from 0 Hz to fs/2
win = hann(frame_size, 'periodic');
S = stft(x, fs, 'Window', win, 'OverlapLength', overlap, ...
    'FFTLength', frame_size, 'FrequencyRange', 'onesided');

% Phase processing
Y = enhanced_phase_processing(S, hop_in, hop_out, fs);

% Resynthesis with the output hop (weighted overlap-add)
y = istft(Y, fs, 'Window', win, 'OverlapLength', frame_size - hop_out, ...
    'FFTLength', frame_size, 'Method', 'wola', 'FrequencyRange', 'onesided');
y = real(y(:));

% Resample the stretched signal back toward the original duration;
% this is the step that actually changes the pitch
y = resample(y, hop_in, hop_out);

% Length matching
if length(y) > length(x)
    y = y(1:length(x));
elseif length(y) < length(x)
    y = [y; zeros(length(x)-length(y), 1)];
end

% Post-processing: spectral smoothing to reduce artifacts
y = spectral_smoothing(y, fs, ratio);
end

function y = spectral_smoothing(x, fs, ratio)
% Low-pass filter to reduce high-frequency artifacts.
% The cutoff is kept below the Nyquist frequency so butter() accepts it.
cutoff = min([8000, 20000 / ratio^0.5, 0.45*fs]);
[b, a] = butter(4, cutoff/(fs/2), 'low');
y = filtfilt(b, a, x);
end

function Y = enhanced_phase_processing(X, hop_in, hop_out, fs)
% Phase propagation for a one-sided STFT with an even FFT length
Y = zeros(size(X));
if isempty(X), return; end
n_bins = size(X, 1);
freq_bins = (0:n_bins-1)' * fs / (2*(n_bins-1));
bin_phase_inc = 2*pi * freq_bins * hop_in / fs;   % expected phase advance per input hop

phase_in_prev = angle(X(:,1));
phase_out = phase_in_prev;
Y(:,1) = X(:,1);

for k = 2:size(X,2)
    mag = abs(X(:,k));
    phase_in = angle(X(:,k));

    % Deviation from the expected advance, wrapped to [-pi, pi]
    delta_phase = phase_in - phase_in_prev - bin_phase_inc;
    delta_phase = delta_phase - 2*pi*round(delta_phase/(2*pi));

    % True phase advance per input hop (instantaneous frequency)
    inst_phase_inc = bin_phase_inc + delta_phase;

    % Accumulate the output phase, scaled to the output hop
    phase_out = phase_out + inst_phase_inc * hop_out / hop_in;

    % Synthesize the new frame
    Y(:,k) = mag .* exp(1j*phase_out);

    % Keep the input phase for the next frame
    phase_in_prev = phase_in;
end
end

function plot_pitch_comparison(time_src, f0_src, time_ref, f0_ref, f0_corrected, src_wave, ref_wave, corr_wave, fs)
% Make all pitch sequences the same length
min_length = min([length(time_src), length(time_ref), length(f0_corrected)]);
time_src = time_src(1:min_length);
f0_src = f0_src(1:min_length);
time_ref = time_ref(1:min_length);
f0_ref = f0_ref(1:min_length);
f0_corrected = f0_corrected(1:min_length);

% Combined figure with waveforms and pitch curves
pitch_fig = figure('Name', 'Waveform and Pitch Analysis', 'Position', [100 100 900 800]);

% Original audio: waveform + pitch
subplot(3,1,1);
time_wave_src = (1:length(src_wave)) / fs;
yyaxis left;
plot(time_wave_src, src_wave, 'Color', [0.7 0.7 1], 'LineWidth', 0.5);
ylabel('Amplitude');
ylim([-1.1 1.1]);   % fixed amplitude range
yyaxis right;
plot(time_src, f0_src, 'b', 'LineWidth', 1.5);
hold on;
plot(time_ref, f0_ref, 'r--', 'LineWidth', 1.5);
hold off;
title('Original audio waveform and pitch');
xlabel('Time (s)');
ylabel('Frequency (Hz)');
legend('Original waveform', 'Original pitch', 'Reference pitch', 'Location', 'best');
grid on;

% Reference audio: waveform + pitch
subplot(3,1,2);
time_wave_ref = (1:length(ref_wave)) / fs;
yyaxis left;
plot(time_wave_ref, ref_wave, 'Color', [1 0.7 0.7], 'LineWidth', 0.5);
ylabel('Amplitude');
ylim([-1.1 1.1]);
yyaxis right;
plot(time_ref, f0_ref, 'r', 'LineWidth', 1.5);
title('Reference audio waveform and pitch');
xlabel('Time (s)');
ylabel('Frequency (Hz)');
legend('Reference waveform', 'Reference pitch', 'Location', 'best');
grid on;

% Corrected audio: waveform + pitch
subplot(3,1,3);
time_wave_corr = (1:length(corr_wave)) / fs;
yyaxis left;
plot(time_wave_corr, corr_wave, 'Color', [0.7 1 0.7], 'LineWidth', 0.5);
ylabel('Amplitude');
ylim([-1.1 1.1]);
yyaxis right;
plot(time_src, f0_src, 'b:', 'LineWidth', 1);
hold on;
plot(time_ref, f0_ref, 'r--', 'LineWidth', 1);
plot(time_src, f0_corrected, 'g', 'LineWidth', 2);
hold off;
title('Corrected audio waveform and pitch');
xlabel('Time (s)');
ylabel('Frequency (Hz)');
legend('Corrected waveform', 'Original pitch', 'Reference pitch', 'Corrected pitch', 'Location', 'best');
grid on;

% Pitch error summary
valid_idx = (f0_src > 0) & (f0_ref > 0) & (f0_corrected > 0);
if any(valid_idx)
    src_error = mean(abs(f0_src(valid_idx) - f0_ref(valid_idx)));
    corr_error = mean(abs(f0_corrected(valid_idx) - f0_ref(valid_idx)));
    annotation(pitch_fig, 'textbox', [0.15 0.05 0.7 0.05], ...
        'String', sprintf('Mean original pitch error: %.2f Hz | Mean corrected pitch error: %.2f Hz | Improvement: %.1f%%', ...
        src_error, corr_error, (src_error - corr_error)/src_error*100), ...
        'FitBoxToText', 'on', 'BackgroundColor', [0.9 0.9 0.9], ...
        'FontSize', 12, 'HorizontalAlignment', 'center');
end
end

function play_audio(fig, audio_type)
if ~isvalid(fig)
    errordlg('The main window is invalid.', 'Playback error');
    return;
end

switch audio_type
    case 'source'
        audio = fig.UserData.source_audio;
        title_text = 'Play original audio';
        if isempty(audio)
            errordlg('No original audio data found.', 'Playback error');
            return;
        end
    case 'corrected'
        audio = fig.UserData.corrected_audio;
        title_text = 'Play corrected audio';
        if isempty(audio)
            errordlg('Please run the pitch correction first.', 'Playback error');
            return;
        end
    otherwise
        return;
end

fs = fig.UserData.fs;
player = audioplayer(audio, fs);

% Playback control window
play_fig = uifigure('Name', title_text, 'Position', [500 500 300 150]);

% Progress bar: a vertical cursor on a [0,1] axis
ax = uiaxes(play_fig, 'Position', [50 100 200 20]);
hold(ax, 'on');
prog_line = plot(ax, [0 0], [0 1], 'b', 'LineWidth', 2);
hold(ax, 'off');
xlim(ax, [0 1]);
ylim(ax, [0 1]);
set(ax, 'XTick', [], 'YTick', []);

% Elapsed-time label
time_label = uilabel(play_fig, 'Position', [50 80 200 20], ...
    'Text', '00:00 / 00:00', 'HorizontalAlignment', 'center');

% Control buttons
uibutton(play_fig, 'Position', [50 30 60 30], 'Text', 'Play', ...
    'ButtonPushedFcn', @(btn,event) play(player));
uibutton(play_fig, 'Position', [120 30 60 30], 'Text', 'Pause', ...
    'ButtonPushedFcn', @(btn,event) pause(player));
uibutton(play_fig, 'Position', [190 30 60 30], 'Text', 'Stop', ...
    'ButtonPushedFcn', @(btn,event) stop(player));

% Total duration
total_time = length(audio)/fs;
mins = floor(total_time/60);
secs = floor(total_time - mins*60);
total_str = sprintf('%02d:%02d', mins, secs);

% Playback progress callbacks
player.TimerFcn = {@update_playback, play_fig, time_label, total_str, prog_line, length(audio)};
player.TimerPeriod = 0.1;   % update period in seconds
player.StopFcn = @(src,event) stop_playback(src, event, play_fig);
end

function update_playback(player, ~, play_fig, time_label, total_str, prog_line, total_samples)
% Not part of the original post: a minimal version of the timer callback
% that play_audio references. It moves the cursor and refreshes the label.
if ~isvalid(play_fig)
    return;
end
frac = player.CurrentSample / total_samples;
set(prog_line, 'XData', [frac frac]);
elapsed = player.CurrentSample / player.SampleRate;
time_label.Text = sprintf('%02d:%02d / %s', floor(elapsed/60), floor(mod(elapsed, 60)), total_str);
end

function stop_playback(src, ~, fig)
stop(src);
if isvalid(fig)
    close(fig);
end
end

function save_audio(fig)
if ~isvalid(fig) || isempty(fig.UserData.corrected_audio)
    errordlg('No valid audio data to save.', 'Save error');
    return;
end
[file, path] = uiputfile('*.wav', 'Save corrected audio');
if isequal(file, 0), return; end
audiowrite(fullfile(path, file), fig.UserData.corrected_audio, fig.UserData.fs);
msgbox('Audio saved successfully.', 'Done');
end

The correction fails with an error saying that Y must be real.
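A complex time-domain signal after phase manipulation usually means the spectrum is no longer conjugate symmetric. The sketch below is a toy illustration, not the poster's vocoder: it uses a plain FFT of a 16-sample cosine, and the names (`Y_bad`, `Y_good`, `Yh`) are made up for the example. It rotates the phase of every bin of a two-sided spectrum, which yields a complex signal, and then repeats the rotation on the non-negative-frequency half only and mirrors it, which yields a real one.

```matlab
% Why phase edits on a two-sided spectrum can produce a complex signal.
N = 16;
n = (0:N-1)';
x = cos(2*pi*3*n/N);
X = fft(x);

% (a) Rotate the phase of every bin: conjugate symmetry is lost
Y_bad = abs(X) .* exp(1j*(angle(X) + 0.3));
y_bad = ifft(Y_bad);
imag_bad = max(abs(imag(y_bad)));        % clearly nonzero

% (b) Rotate only bins 0..N/2, then mirror the rest as complex conjugates
half = (1:N/2+1)';
Yh = abs(X(half)) .* exp(1j*(angle(X(half)) + 0.3));
Yh([1 end]) = real(Yh([1 end]));         % DC and Nyquist bins must be real
Y_good = [Yh; conj(Yh(end-1:-1:2))];
y_good = ifft(Y_good);
imag_good = max(abs(imag(y_good)));      % zero up to rounding

fprintf('imaginary part: two-sided edit %.3f, mirrored edit %.1e\n', ...
    imag_bad, imag_good);
```

In MATLAB, `ifft(Y, 'symmetric')` is another way to request a real result when the spectrum is symmetric only up to rounding error. Taking `real()` of a clearly asymmetric result hides the problem instead of fixing it.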

JonSco
  • Followers: 113