songsee

Generates spectrograms and audio feature visualizations (mel, chroma, MFCC, and more) from audio files.

community
MIT

Generate spectrograms and multi-panel audio feature visualizations from audio files.

Prerequisites

Requires Go. Install with:

go install github.com/steipete/songsee/cmd/songsee@latest

Optional: ffmpeg for formats beyond WAV/MP3.
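
Before running anything, it can help to confirm that both tools are reachable. The following is a minimal sketch, not part of songsee itself: `have_cmd` and `check_songsee_prereqs` are hypothetical helper names, and the check relies only on the POSIX `command -v` builtin.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with songsee).

# have_cmd NAME: succeed if NAME resolves to a command on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool. songsee is required, ffmpeg is optional,
# so only a missing songsee makes the function fail.
check_songsee_prereqs() {
    rc=0
    if have_cmd songsee; then
        echo "songsee: found"
    else
        echo "songsee: missing (go install github.com/steipete/songsee/cmd/songsee@latest)"
        rc=1
    fi
    if have_cmd ffmpeg; then
        echo "ffmpeg: found"
    else
        echo "ffmpeg: missing (only WAV/MP3 input will decode)"
    fi
    return "$rc"
}
```

If songsee was just installed and is still reported missing, the Go binary directory (`$(go env GOPATH)/bin` unless `GOBIN` is set) is probably not on `PATH`.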

Quick Start

# Basic spectrogram
songsee track.mp3

# Save to a specific file
songsee track.mp3 -o spectrogram.png

# Multi-panel visualization grid
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux

# Time slice (start at 12.5s, 8s duration)
songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg

# From stdin
cat track.mp3 | songsee - --format png -o out.png

Visualization Types

Use --viz with comma-separated values:

| Type | Description |
|------|-------------|
| spectrogram | Standard frequency spectrogram |
| mel | Mel-scaled spectrogram |
| chroma | Pitch class distribution |
| hpss | Harmonic/percussive separation |
| selfsim | Self-similarity matrix |
| loudness | Loudness over time |
| tempogram | Tempo estimation |
| mfcc | Mel-frequency cepstral coefficients |
| flux | Spectral flux (onset detection) |

Multiple --viz types render as a grid in a single image.
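
When the `--viz` list is assembled by a script, a typo is easy to miss. As a rough sketch, the list could be checked against the types in the table above before songsee is invoked. `validate_viz` is a hypothetical helper, and how songsee itself reacts to an unknown type is not documented here.

```shell
#!/bin/sh
# validate_viz LIST: succeed only if every comma-separated entry in LIST is
# one of the visualization types from the table. Hypothetical helper; it
# never calls songsee.
validate_viz() {
    rest=$1
    if [ -z "$rest" ]; then
        echo "empty --viz list" >&2
        return 1
    fi
    while :; do
        v=${rest%%,*}            # first entry, up to the next comma
        case $v in
            spectrogram|mel|chroma|hpss|selfsim|loudness|tempogram|mfcc|flux) ;;
            *) echo "unknown viz type: '$v'" >&2; return 1 ;;
        esac
        case $rest in
            *,*) rest=${rest#*,} ;;   # drop the entry just checked
            *)   break ;;             # no comma left: that was the last one
        esac
    done
}
```

A typical use would be `validate_viz "$types" && songsee track.mp3 --viz "$types"`. The check is deliberately strict: empty entries, such as the one left by a trailing comma, are rejected.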

Common Flags

| Flag | Description |
|------|-------------|
| --viz | Visualization types (comma-separated) |
| --style | Color palette: classic, magma, inferno, viridis, gray |
| --width / --height | Output image dimensions |
| --window / --hop | FFT window and hop size |
| --min-freq / --max-freq | Frequency range filter |
| --start / --duration | Time slice of the audio |
| --format | Output format: jpg or png |
| -o | Output file path |
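
The `--format` and `-o` flags combine naturally for batch rendering. The sketch below assumes one image per input file, written next to the input; `out_path` and `render_all` are hypothetical helper names, and only flags from the table above are used.

```shell
#!/bin/sh
# out_path INPUT FORMAT: swap the input's extension for FORMAT, keeping the
# directory, e.g. "a/b.wav" + "jpg" gives "a/b.jpg". Hypothetical helper.
out_path() {
    case $1 in
        */*) dir=${1%/*}/; base=${1##*/} ;;
        *)   dir=;         base=$1 ;;
    esac
    printf '%s%s.%s\n' "$dir" "${base%.*}" "$2"
}

# render_all FORMAT FILE...: run songsee once per file and stop at the
# first failure.
render_all() {
    fmt=$1
    shift
    for f in "$@"; do
        songsee "$f" --format "$fmt" -o "$(out_path "$f" "$fmt")" || return 1
    done
}
```

For example, `render_all png recordings/*.mp3` would write `recordings/<name>.png` for each input. Other flags such as `--viz` or `--style` could be added to the songsee call in the same way.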

Notes

  • WAV and MP3 are decoded natively; other formats require ffmpeg
  • Output images can be inspected with vision_analyze for automated audio analysis
  • Useful for comparing audio outputs, debugging synthesis, or documenting audio processing pipelines
