What it does: Adapts DINO self-distillation to music, using constant-Q transform (CQT) spectrograms, harmonic-aware positional encoding, and dual-axis attention to capture pitch, harmony, and tempo.
Links: (coming soon: code + paper)
Let Triggers Control — Frequency-aware Dropout
Status: Under Review
What it does: Introduces a frequency-aware dropout method for token control, enabling better handling of trigger tokens in generative models.
Links: (preprint link coming soon)
Illustrious — Open Advanced Illustration Model
Status: Technical Report (2024)
What it does: A large-scale illustration generation model, openly released for research and creative use.