Technologies Used
The project uses Python libraries for audio processing and visualization, and TensorFlow with Keras for deep learning.
Core Technologies
Python Libraries
- os - File system operations
- numpy - Numerical computing
- librosa - Audio processing
- matplotlib - Visualization
- sklearn - Model evaluation
Deep Learning
- TensorFlow - ML framework
- Keras - High-level API
- CNN Architecture
- Custom loss functions
Implementation Details
Data Processing Pipeline
- Sample Rate: 16000 Hz
- Duration: 5 seconds per clip
- Mel Spectrograms: 128 frequency bins
- Time Steps: 109 frames (each spectrogram is padded or truncated to this length)
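The settings above can be sketched as a small feature extractor. This is a minimal sketch, not the project's actual code: the function names `fix_time_steps` and `extract_features` are hypothetical, and zero-padding the power spectrogram before decibel scaling is an assumption.

```python
import numpy as np

SAMPLE_RATE = 16000   # Hz, from the pipeline settings
DURATION = 5          # seconds loaded per clip
N_MELS = 128          # mel frequency bins
MAX_TIME_STEPS = 109  # target number of spectrogram frames


def fix_time_steps(spec, max_steps=MAX_TIME_STEPS):
    """Zero-pad or truncate a (n_mels, frames) array along the time axis."""
    frames = spec.shape[1]
    if frames < max_steps:
        return np.pad(spec, ((0, 0), (0, max_steps - frames)), mode="constant")
    return spec[:, :max_steps]


def extract_features(path):
    """Load one clip and return a (128, 109, 1) log-mel spectrogram."""
    import librosa  # imported here so fix_time_steps works without librosa

    audio, _ = librosa.load(path, sr=SAMPLE_RATE, duration=DURATION)
    mel = librosa.feature.melspectrogram(y=audio, sr=SAMPLE_RATE, n_mels=N_MELS)
    # Assumption: pad the power spectrogram first, so padded frames
    # read as silence once converted to decibels.
    mel = fix_time_steps(mel)
    log_mel = librosa.power_to_db(mel, ref=np.max)
    return log_mel[..., np.newaxis]  # add the channel axis
```

Keeping the padding logic in its own function makes the fixed output width easy to check without any audio files.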
Model Architecture
- Input Shape: (128, 109, 1)
- Convolutional Layers: 2 layers
- Pooling Layers: 2 layers
- Dense Layers: 128 units
- Output: 2 classes (softmax)
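One way to realize this architecture in Keras is sketched below. The filter counts (32 and 64), the 3x3 kernels, the 2x2 pooling windows, and the ReLU activations are assumptions, since the list above does not state them. `build_model` and `conv_pool_shape` are hypothetical names.

```python
def conv_pool_shape(height, width, n_blocks=2, kernel=3, pool=2):
    """Trace the spatial size through n_blocks of unpadded conv + max pooling."""
    for _ in range(n_blocks):
        height, width = height - kernel + 1, width - kernel + 1  # "valid" conv
        height, width = height // pool, width // pool            # max pooling
    return height, width


def build_model(input_shape=(128, 109, 1), n_classes=2):
    """Two conv/pool blocks, a 128-unit dense layer, and a softmax output."""
    from tensorflow.keras import layers, models  # imported lazily

    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu"),  # assumed: 32 filters
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),  # assumed: 64 filters
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])


# Spatial size of the feature map that reaches the Flatten layer.
print(conv_pool_shape(128, 109))  # → (30, 25)
```

Under these assumptions the feature map entering the dense layer is 30 x 25 per filter, so most of the model's parameters sit in the 128-unit dense layer.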
Training Process
- Dataset: ASVspoof2019_LA_train
- Split: 80% training, 20% validation
- Batch Size: 32
- Epochs: 10
- Optimizer: Adam
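The training settings above might translate into something like the following sketch. The helper names `split_indices` and `train` are hypothetical, the random seed is arbitrary, and `categorical_crossentropy` is only a stand-in because the project's custom loss is not described here. Labels are assumed to be one-hot encoded with shape `(n, 2)`.

```python
import numpy as np


def split_indices(n_samples, val_fraction=0.2, seed=42):
    """Shuffle sample indices and split them into train and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_val = int(round(n_samples * val_fraction))
    return order[n_val:], order[:n_val]


def train(model, X, y, batch_size=32, epochs=10):
    """Compile a Keras model with Adam and fit it on an 80/20 split."""
    train_idx, val_idx = split_indices(len(X))
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",  # stand-in for the custom loss
        metrics=["accuracy"],
    )
    return model.fit(
        X[train_idx], y[train_idx],
        validation_data=(X[val_idx], y[val_idx]),
        batch_size=batch_size,
        epochs=epochs,
    )
```

`train_test_split` from scikit-learn would work equally well for the split. The index-based version is shown only to make the 80/20 arithmetic explicit.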
File Structure
project/
├── LA/
│   ├── ASVspoof2019_LA_train/
│   │   └── flac/
│   └── ASVspoof2019_LA_cm_protocols/
│       └── ASVspoof2019.LA.cm.train.trn.txt
├── TestEvaluation/
│   └── test_eval.txt
└── audio_classifier.h5
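The protocol file in this layout pairs each utterance with its label. The sketch below assumes the usual five-column ASVspoof 2019 LA layout (speaker ID, utterance ID, two system fields, then `bonafide` or `spoof`). `parse_protocol` is a hypothetical helper, the 1/0 label encoding is an assumption, and the sample lines are illustrative.

```python
def parse_protocol(lines):
    """Map each utterance ID to 1 (bonafide) or 0 (spoof)."""
    labels = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        utt_id, key = parts[1], parts[-1]
        labels[utt_id] = 1 if key == "bonafide" else 0
    return labels


# Illustrative lines in the assumed five-column format.
sample = [
    "LA_0079 LA_T_1138215 - - bonafide",
    "LA_0098 LA_T_9779814 - A01 spoof",
]
print(parse_protocol(sample))
```

Each utterance ID would then be expected to match a `.flac` file of the same name under `LA/ASVspoof2019_LA_train/flac/`.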