So I only need to replace `adjusted_fft_frames = mag_frames * np.exp(1j * 2 * np.pi * pitch_freqs_smoothed[:, np.newaxis] * freqs[np.newaxis, :])` with these three lines: `pitch_freqs_smoothed = pitch_freqs_smoothed[:, np.newaxis]`; `pitch_freqs_smoothed = np.tile(pitch_freqs_smoothed, (1, mag_frames.shape[1]))`; `adjusted_fft_frames = mag_frames * np.exp(1j * 2 * np.pi * pitch_freqs_smoothed * freqs[np.newaxis, :])`, and that's it?
Posted: 2024-02-12 15:08:05 Views: 141
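Yes, you can replace the original line with the new code. Note that the two versions compute the same values: NumPy broadcasting already expands `pitch_freqs_smoothed[:, np.newaxis]` across the bin axis, and `np.tile` only makes that expansion explicit. What matters is that `freqs` has the right length, namely `mag_frames.shape[1]` entries, so that it broadcasts against the frames, and that `pitch_freqs_smoothed` ends up with shape `(198, 882)`.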
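The length check on `freqs` can be sketched as follows. The values of `n_fft` and `sr` are hypothetical (the thread does not state them); `n_fft = 1762` is chosen only because `np.fft.rfftfreq` returns `n_fft // 2 + 1` entries, which gives 882 here. The zero-filled `mag_frames` is a placeholder with the shape quoted above.

```python
import numpy as np

# Hypothetical values: the thread does not state n_fft or sr.
n_fft, sr = 1762, 44100
freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)   # n_fft // 2 + 1 bin frequencies

# Placeholder with the shape quoted in the answer.
mag_frames = np.zeros((198, 882))

# Broadcasting against freqs[np.newaxis, :] needs one frequency per column.
bins_match = freqs.shape[0] == mag_frames.shape[1]
print(freqs.shape, bins_match)
```

If `bins_match` is false in the real script, neither the broadcast version nor the `np.tile` version will run, so that is the first thing to verify.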
The complete modified code is as follows:
```
freqs = np.fft.rfftfreq(n_fft, d=1.0/sr)
pitch_freqs_smoothed = pitch_freqs_smoothed[:, np.newaxis]
pitch_freqs_smoothed = np.tile(pitch_freqs_smoothed, (1, mag_frames.shape[1]))
adjusted_fft_frames = mag_frames * np.exp(1j * 2 * np.pi * pitch_freqs_smoothed * freqs[np.newaxis, :])
```
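The modified code expands `pitch_freqs_smoothed` into an array of shape `(198, 882)`, multiplies it elementwise with `freqs` (broadcast from shape `(1, 882)`), and produces a new matrix `adjusted_fft_frames` of shape `(198, 882)`. The result is a complex spectrum rather than a magnitude spectrum: its magnitude is still `mag_frames`, and only the phase is new.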
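A small check, under assumed toy sizes, suggests that the broadcast version and the `np.tile` version give the same result. The sizes 4 and 6 stand in for 198 and 882, and all input arrays are random placeholders rather than real audio data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_bins = 4, 6                       # stand-ins for 198 and 882
mag_frames = rng.random((n_frames, n_bins))
pitch_freqs_smoothed = rng.random(n_frames)   # one pitch value per frame
freqs = np.linspace(0.0, 1.0, n_bins)         # one frequency per bin

# Original version: (n_frames, 1) is broadcast against (1, n_bins).
broadcast = mag_frames * np.exp(
    1j * 2 * np.pi * pitch_freqs_smoothed[:, np.newaxis] * freqs[np.newaxis, :]
)

# Replacement version: tile to (n_frames, n_bins) before multiplying.
tiled_pitch = np.tile(pitch_freqs_smoothed[:, np.newaxis], (1, n_bins))
tiled = mag_frames * np.exp(1j * 2 * np.pi * tiled_pitch * freqs[np.newaxis, :])

same_shape = broadcast.shape == tiled.shape == (n_frames, n_bins)
same_values = np.allclose(broadcast, tiled)
print(same_shape, same_values)
```

So if the original line raised a shape error, tiling alone is unlikely to fix it; the more probable cause is a mismatch between `len(freqs)` and `mag_frames.shape[1]`.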
Related questions
Based on the earlier code snippets, explain `adjusted_fft_frames = mag_frames * np.exp(1j * 2 * np.pi * pitch_freqs_smoothed[:, np.newaxis] * freqs[np.newaxis, :])`
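This code first takes the `pitch_freqs_smoothed` array computed earlier, which holds the fundamental frequency of each frame. Indexing it with `[:, np.newaxis]` turns it into a column vector of shape `(n_frames, 1)`, so that NumPy broadcasting can pair each frame's value with every bin. The `freqs` array is one-dimensional and holds the frequency of each FFT bin; `freqs[np.newaxis, :]` gives it shape `(1, n_bins)`.
Next, for every frame and every FFT bin, the code builds the complex value `exp(1j * θ)` with `θ = 2π * pitch * freq`, where `pitch` is that frame's entry in `pitch_freqs_smoothed` and `freq` is that bin's entry in `freqs`. By Euler's formula its real part is `cos θ` and its imaginary part is `sin θ`, so it acts as a rotation factor: its magnitude is 1 and only its phase changes with frame and bin. Finally, this factor is multiplied by `mag_frames`, the magnitude of the original FFT frames computed earlier. The result is a set of adjusted FFT frames that keep the original magnitudes and carry the pitch information in their phase.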
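The claim that the factor only rotates the phase can be checked numerically. This is a minimal sketch with random placeholder data; `n_fft = 8` and a sample spacing of `1/8` are assumptions chosen to keep the numbers small.

```python
import numpy as np

rng = np.random.default_rng(1)
mag_frames = rng.random((3, 5))              # placeholder magnitudes, non-negative
pitch = rng.random(3)                        # placeholder per-frame pitch values
freqs = np.fft.rfftfreq(8, d=1.0 / 8)        # 5 bin frequencies: 0, 1, 2, 3, 4

phase = 2 * np.pi * pitch[:, np.newaxis] * freqs[np.newaxis, :]
rotation = np.exp(1j * phase)                # cos(phase) + 1j * sin(phase)
adjusted = mag_frames * rotation

unit_magnitude = np.allclose(np.abs(rotation), 1.0)
magnitude_preserved = np.allclose(np.abs(adjusted), mag_frames)
euler_holds = np.allclose(rotation, np.cos(phase) + 1j * np.sin(phase))
print(unit_magnitude, magnitude_preserved, euler_holds)
```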
```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from scipy.linalg import sqrtm


class JDA:
    def __init__(self, n_components=3, lambd=1.0):
        self.n_components = n_components
        self.lambd = lambd

    def fit(self, Xs, Xt, ys):
        ns, _ = Xs.shape
        nt, _ = Xt.shape
        Z = np.vstack((Xs, Xt))
        Z_mean = np.mean(Z, axis=0)
        Xs_centered = Xs - np.mean(Xs, axis=0)
        Xt_centered = Xt - np.mean(Xt, axis=0)
        C_s = np.cov(Xs_centered.T) / ns
        C_t = np.cov(Xt_centered.T) / nt
        Cs_inv_sqrt = invsqrt(C_s + self.lambd * np.eye(len(Z_mean)))
        Ct_inv_sqrt = invsqrt(C_t + self.lambd * np.eye(len(Z_mean)))
        M = np.dot(Cs_inv_sqrt, Ct_inv_sqrt).T
        U, S, V = np.linalg.svd(M[:ns], full_matrices=False)
        W = np.dot(U[:, :self.n_components], V[:self.n_components])
        self.Xs_new = np.dot(Xs_centered, W)
        self.Xr_new = np.dot(np.concatenate([Xs_centered, Xt_centered]), W)
        return self

    def transform(self, X):
        return np.dot(X - np.mean(X, axis=0), self.W)

    @staticmethod
    def invsqrt(matrix):
        u, s, v = np.linalg.svd(matrix)
        return np.dot(u, np.dot(np.diag(1.0 / np.sqrt(s)), v))


# Main entry point
if __name__ == '__main__':
    dataset = np.load('dataset.npz')
    X_train_source = dataset['X_train']
    X_train_target = dataset['X_val']  # Assume the validation set serves as the target domain
    y_train_source = dataset['y_train']
    jda = JDA(n_components=3, lambd=1e-6)
    jda.fit(X_train_source, X_train_target, y_train_source)
    X_train_aligned = jda.transform(X_train_source)
    X_val_aligned = jda.transform(X_train_target)
    clf = KNeighborsClassifier(n_neighbors=3)
    clf.fit(X_train_aligned, y_train_source)
    accuracy = clf.score(jda.transform(dataset['X_test']), dataset['y_test'])
    print(f"Accuracy on test set after JDA alignment: {accuracy:.4f}")
    print("Joint Distribution Alignment completed.")
```
```
Traceback (most recent call last):
  File "C:/Users/Lenovo/AppData/Roaming/JetBrains/PyCharmCE2020.2/scratches/scratch_21.py", line 53, in <module>
    jda.fit(X_train_source, X_train_target, y_train_source)
  File "C:/Users/Lenovo/AppData/Roaming/JetBrains/PyCharmCE2020.2/scratches/scratch_21.py", line 32, in fit
    self.Xs_new = np.dot(Xs_centered, W)
  File "<__array_function__ internals>", line 6, in dot
ValueError: shapes (144,3000) and (144,3000) not aligned: 3000 (dim 1) != 144 (dim 0)
```
Judging from the error message, the problem occurs in the matrix multiplication inside the `fit` function:
```python
self.Xs_new = np.dot(Xs_centered, W)
```
This is a dimension mismatch: `np.dot` cannot align `shapes (144,3000) and (144,3000)` because `Xs_centered.shape[1]` (3000) is not equal to `W.shape[0]` (144).
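The shape rule behind this error can be reproduced with small placeholder arrays. The sizes 6 and 10 are stand-ins for the 144 samples and 3000 features in the traceback, and the zero-filled arrays are illustrative only.

```python
import numpy as np

ns, d = 6, 10                         # stand-ins for 144 samples and 3000 features
Xs_centered = np.zeros((ns, d))
W = np.zeros((ns, d))                 # same (wrong) shape that W has in the traceback

try:
    np.dot(Xs_centered, W)            # needs Xs_centered.shape[1] == W.shape[0]
    message = None
except ValueError as exc:
    message = str(exc)

# A (d, k) matrix satisfies the rule and yields an (ns, k) result.
projected = np.dot(Xs_centered, np.zeros((d, 3)))
print(message)
print(projected.shape)
```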
### Root cause analysis
In the code:
1. **Computing the covariance matrices**
```python
C_s = np.cov(Xs_centered.T) / ns
C_t = np.cov(Xt_centered.T) / nt
```
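The covariance matrices here have shape `[n_features, n_features]`. With 3000-dimensional input data, both matrices are `(3000, 3000)`.
2. **Computing the inverse square-root matrices**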
```python
Cs_inv_sqrt = invsqrt(C_s + self.lambd * np.eye(len(Z_mean)))
Ct_inv_sqrt = invsqrt(C_t + self.lambd * np.eye(len(Z_mean)))
```
The results are still `(3000, 3000)` matrices.
3. **Singular value decomposition (SVD)**
```python
M = np.dot(Cs_inv_sqrt, Ct_inv_sqrt).T
U, S, V = np.linalg.svd(M[:ns], full_matrices=False)
```
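This step is meant to extract the top `n_components` components and build the transformation matrix `W`. However, the slice `M[:ns]` keeps only the first `ns = 144` rows of `M`, so the SVD runs on a `(144, 3000)` matrix: `U` is `(144, 144)` and `V` is `(144, 3000)`, rather than factors of the full `(3000, 3000)` matrix.
4. **Final matrix multiplication**
The resulting `W = np.dot(U[:, :k], V[:k])` (with k = `n_components`) has shape `(144, 3000)`, i.e. `(ns, d)`, while the centered data matrix keeps its original shape `(N, d)`. A valid product `np.dot(Xs_centered, W)` needs `W` to have `d` rows, so the two cannot be multiplied directly.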
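The four steps above can be traced on toy data to see where the shape of `W` comes from. This is a sketch, not the poster's actual run: `ns = 6` and `d = 10` stand in for 144 and 3000, `nt = 5` is arbitrary, the inputs are random, and `invsqrt` is written as a module-level function using the same SVD formula as the posted helper.

```python
import numpy as np

def invsqrt(matrix):
    # Same SVD-based formula as the posted helper.
    u, s, vt = np.linalg.svd(matrix)
    return u @ np.diag(1.0 / np.sqrt(s)) @ vt

rng = np.random.default_rng(0)
ns, nt, d, k = 6, 5, 10, 3            # toy sizes; nt is arbitrary
Xs = rng.standard_normal((ns, d))
Xt = rng.standard_normal((nt, d))
lambd = 1.0

Xs_centered = Xs - Xs.mean(axis=0)
Xt_centered = Xt - Xt.mean(axis=0)

C_s = np.cov(Xs_centered.T) / ns                      # step 1: (d, d)
C_t = np.cov(Xt_centered.T) / nt
Cs_inv_sqrt = invsqrt(C_s + lambd * np.eye(d))        # step 2: (d, d)
Ct_inv_sqrt = invsqrt(C_t + lambd * np.eye(d))
M = np.dot(Cs_inv_sqrt, Ct_inv_sqrt).T                # (d, d)

U, S, V = np.linalg.svd(M[:ns], full_matrices=False)  # step 3: input is (ns, d)
W = np.dot(U[:, :k], V[:k])                           # step 4: (ns, d), not (d, k)

shapes = {"C_s": C_s.shape, "M": M.shape, "U": U.shape, "V": V.shape, "W": W.shape}
print(shapes)

# Sanity check on the helper: C^(-1/2) C C^(-1/2) should be the identity.
C_reg = C_s + lambd * np.eye(d)
whitened = Cs_inv_sqrt @ C_reg @ Cs_inv_sqrt
```

Everything up to `M` has the expected `(d, d)` shape; the row slice `M[:ns]` is the point where the sample count `ns` leaks into the shape of `W`.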
---
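### Solution
Check the following places and adjust the code logic:
#### Check 1: confirm that `M` is generated as expected
Make sure `Cs_inv_sqrt` and `Ct_inv_sqrt` still have the correct size before they are combined, for example with debug prints or assertions: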
```python
assert Cs_inv_sqrt.shape == (Z_mean.size, Z_mean.size)
assert Ct_inv_sqrt.shape == (Z_mean.size, Z_mean.size)
```
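Also clarify what the transpose after `np.dot()` is meant to accomplish.
#### Fix 2: correct the `svd` slice and the construction of `W`
The current computation of `W` combines truncated factors: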
```python
W = np.dot(U[:, :self.n_components], V[:self.n_components])
```
This line is likely the problem; the standard form is recommended instead:
```python
W = U[:, :self.n_components]
```
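This works only if the SVD is computed on the full matrix `M` rather than on `M[:ns]`, so that `U` has `d` rows and `W` is `(d, k)`. `np.dot(X, W)` then satisfies the shape rule for any `X` with `d` columns.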
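A minimal sketch of this fix, assuming the SVD is taken over the full matrix: `M` here is a random placeholder for the `(d, d)` matrix built in `fit`, and the sizes are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
ns, d, k = 6, 10, 3
Xs_centered = rng.standard_normal((ns, d))
M = rng.standard_normal((d, d))          # placeholder for the (d, d) matrix M

U, S, Vt = np.linalg.svd(M, full_matrices=False)   # full M, no row slice
W = U[:, :k]                                       # (d, k): one column per component

Xs_new = np.dot(Xs_centered, W)                    # (ns, d) @ (d, k) -> (ns, k)
columns_orthonormal = np.allclose(W.T @ W, np.eye(k))
print(W.shape, Xs_new.shape, columns_orthonormal)
```

Because `np.linalg.svd` returns singular values in descending order, the first `k` columns of `U` correspond to the largest ones.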
---
Finally, the complete fixed version looks like this:
```python
from scipy.sparse.linalg import svds

def fit(self, Xs, Xt, ys):
    ...
    # Decompose the full (d, d) matrix M rather than M[:ns], so that U has d rows.
    U, _, VT = svds(M, k=self.n_components)  # keep only the top-k singular values/vectors
    W = U @ VT  # (d, d) rank-k map; use W = U instead for a (d, k) projection
    self.W = W  # transform() reads self.W, so store it
    ...  # make sure the remaining dot products use matching shapes
    return self
```
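Putting the pieces together, an end-to-end version might look like the sketch below. It is a hypothetical, shape-correct variant of the posted class, not a reference implementation of JDA: the class name `AlignmentSketch` is made up, `ys` is kept only to mirror the posted signature, `W` is taken as the first `n_components` left singular vectors of the full `M`, and the data is random.

```python
import numpy as np

class AlignmentSketch:
    """Hypothetical shape-correct variant of the posted class (not a reference JDA)."""

    def __init__(self, n_components=3, lambd=1.0):
        self.n_components = n_components
        self.lambd = lambd

    @staticmethod
    def invsqrt(matrix):
        u, s, vt = np.linalg.svd(matrix)
        return u @ np.diag(1.0 / np.sqrt(s)) @ vt

    def fit(self, Xs, Xt, ys=None):
        ns, d = Xs.shape
        nt, _ = Xt.shape
        Xs_centered = Xs - Xs.mean(axis=0)
        Xt_centered = Xt - Xt.mean(axis=0)
        reg = self.lambd * np.eye(d)
        # Called through self, since invsqrt is a static method of the class.
        Cs_inv_sqrt = self.invsqrt(np.cov(Xs_centered.T) / ns + reg)
        Ct_inv_sqrt = self.invsqrt(np.cov(Xt_centered.T) / nt + reg)
        M = np.dot(Cs_inv_sqrt, Ct_inv_sqrt).T
        U, _, _ = np.linalg.svd(M, full_matrices=False)  # full (d, d) matrix
        self.W = U[:, :self.n_components]                # (d, k), stored for transform()
        self.Xs_new = Xs_centered @ self.W
        return self

    def transform(self, X):
        return (X - X.mean(axis=0)) @ self.W

rng = np.random.default_rng(0)
Xs = rng.standard_normal((20, 8))     # toy source domain
Xt = rng.standard_normal((15, 8))     # toy target domain
model = AlignmentSketch(n_components=3, lambd=1e-3).fit(Xs, Xt)
Xs_aligned = model.transform(Xs)
Xt_aligned = model.transform(Xt)
print(Xs_aligned.shape, Xt_aligned.shape)
```

Two details differ from the posted code on purpose: `invsqrt` is called as `self.invsqrt`, and `self.W` is assigned in `fit`, because `transform` depends on it.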