Technologies Used
The project is written in Python and combines librosa-based audio processing with a TensorFlow/Keras convolutional neural network (CNN) for classification.
Core Technologies
Python Libraries
- os - File system operations
- numpy - Numerical computing
- librosa - Audio processing
- matplotlib - Visualization
- scikit-learn (sklearn) - Model evaluation
Deep Learning
- TensorFlow - ML framework
- Keras - High-level API for TensorFlow
- CNN Architecture
- Custom loss functions
Implementation Details
Data Processing Pipeline
- Sample Rate: 16000 Hz
- Duration: 5 seconds per clip
- Mel Spectrograms: 128 frequency bins
- Time Steps: 109 frames (each spectrogram is padded or truncated to this length)
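The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names fix_time_steps and extract_features are hypothetical, and the decibel scaling, right-side padding, and librosa's default hop length are assumptions that the list does not confirm.

```python
import numpy as np

SAMPLE_RATE = 16000  # Hz
DURATION = 5         # seconds per clip
N_MELS = 128         # mel frequency bins
TIME_STEPS = 109     # fixed number of frames per spectrogram


def fix_time_steps(spec, time_steps=TIME_STEPS, pad_value=0.0):
    """Pad or truncate a (n_mels, frames) array along the time axis."""
    frames = spec.shape[1]
    if frames < time_steps:
        # Assumption: pad on the right with a constant value.
        return np.pad(
            spec,
            ((0, 0), (0, time_steps - frames)),
            mode="constant",
            constant_values=pad_value,
        )
    return spec[:, :time_steps]


def extract_features(path):
    """Load one clip and return a (128, 109, 1) log-mel spectrogram."""
    # Imported here so fix_time_steps can be used without librosa installed.
    import librosa

    audio, _ = librosa.load(path, sr=SAMPLE_RATE, duration=DURATION)
    mel = librosa.feature.melspectrogram(y=audio, sr=SAMPLE_RATE, n_mels=N_MELS)
    mel_db = librosa.power_to_db(mel, ref=np.max)  # assumption: dB scaling
    # Pad with the quietest value so padding reads as silence (assumption).
    fixed = fix_time_steps(mel_db, pad_value=mel_db.min())
    return fixed[..., np.newaxis]  # add the channel axis
```

With librosa's default hop length of 512 samples, a full 5-second clip at 16000 Hz yields about 157 frames, so under that assumption full-length clips would be truncated to 109. The hop length used in the project is not stated here.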
Model Architecture
- Input Shape: (128, 109, 1), i.e. (mel bins, time steps, channels)
- Convolutional Layers: 2 layers
- Pooling Layers: 2 layers
- Dense Layers: 128 units
- Output: 2 classes, bonafide and spoof (softmax)
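A Keras model with these layer counts might look like the sketch below. The filter counts (32 and 64), the 3x3 kernels, the 2x2 pooling windows, and the ReLU activations are illustrative assumptions; only the input shape, the number of layers, the 128-unit dense layer, and the 2-class softmax output come from the list above. The function name build_model is hypothetical.

```python
INPUT_SHAPE = (128, 109, 1)  # (mel bins, time steps, channels)
NUM_CLASSES = 2


def build_model():
    """Build a small CNN matching the layer counts listed above."""
    from tensorflow import keras
    from tensorflow.keras import layers

    return keras.Sequential([
        keras.Input(shape=INPUT_SHAPE),
        layers.Conv2D(32, (3, 3), activation="relu"),  # assumed: 32 filters, 3x3
        layers.MaxPooling2D((2, 2)),                   # assumed: 2x2 window
        layers.Conv2D(64, (3, 3), activation="relu"),  # assumed: 64 filters, 3x3
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
```

Calling build_model().summary() prints the per-layer output shapes, which is a quick way to check that the input shape and layer stack agree.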
Training Process
- Dataset: ASVspoof 2019 Logical Access (LA) training set (ASVspoof2019_LA_train)
- Split: 80% training, 20% validation
- Batch Size: 32
- Epochs: 10
- Optimizer: Adam
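The training settings above could be wired together roughly as shown here. This is a sketch under several assumptions: labels are integer class indices, the split is stratified with scikit-learn's train_test_split, and the loss is plain sparse categorical cross-entropy standing in for the project's custom loss, which is not described in this section. The function names split_data and train are hypothetical.

```python
BATCH_SIZE = 32
EPOCHS = 10
VALIDATION_FRACTION = 0.2  # 80% training, 20% validation


def split_data(features, labels, seed=42):
    """Split into training and validation sets (80/20, stratified by label)."""
    from sklearn.model_selection import train_test_split

    return train_test_split(
        features,
        labels,
        test_size=VALIDATION_FRACTION,
        random_state=seed,  # assumption: fixed seed for reproducibility
        stratify=labels,    # assumption: keep class balance in both sets
    )


def train(model, features, labels):
    """Compile and fit a Keras model with the settings listed above."""
    x_train, x_val, y_train, y_val = split_data(features, labels)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # stand-in for the custom loss
        metrics=["accuracy"],
    )
    return model.fit(
        x_train,
        y_train,
        validation_data=(x_val, y_val),
        batch_size=BATCH_SIZE,
        epochs=EPOCHS,
    )
```

The history object returned by train holds per-epoch training and validation metrics, which can be plotted with matplotlib.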
File Structure
project/
├── LA/
│   ├── ASVspoof2019_LA_train/
│   │   └── flac/
│   └── ASVspoof2019_LA_cm_protocols/
│       └── ASVspoof2019.LA.cm.train.trn.txt
├── TestEvaluation/
│   └── test_eval.txt
└── audio_classifier.h5
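To show how the protocol file and the flac/ directory relate, here is a minimal sketch of a label loader. It assumes the usual ASVspoof 2019 LA protocol layout of five whitespace-separated fields per line, with the file ID in the second field and the key (bonafide or spoof) in the last. The function name parse_protocol and the 0/1 label encoding are hypothetical choices, not taken from the project.

```python
from pathlib import Path

PROTOCOL = Path(
    "project/LA/ASVspoof2019_LA_cm_protocols/ASVspoof2019.LA.cm.train.trn.txt"
)
FLAC_DIR = Path("project/LA/ASVspoof2019_LA_train/flac")


def parse_protocol(lines, flac_dir=FLAC_DIR):
    """Return (flac_path, label) pairs; 0 = bonafide, 1 = spoof (assumed)."""
    samples = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        file_id, key = fields[1], fields[-1]
        label = 0 if key == "bonafide" else 1
        samples.append((flac_dir / f"{file_id}.flac", label))
    return samples


# Example usage, assuming the layout shown above exists on disk:
# with PROTOCOL.open() as handle:
#     samples = parse_protocol(handle)
```

The trained model saved as audio_classifier.h5 can be reloaded with keras.models.load_model; if it was compiled with a custom loss, that function would need to be supplied through the custom_objects argument.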