Technical Details

Technologies Used

The project uses Python libraries for audio processing (librosa, numpy) and deep learning (TensorFlow with Keras).

Core Technologies

Python Libraries

  • os - File system operations
  • numpy - Numerical computing
  • librosa - Audio processing
  • matplotlib - Visualization
  • sklearn - Model evaluation

Deep Learning

  • TensorFlow - ML framework
  • Keras - High-level API
  • CNN Architecture
  • Custom loss functions

Implementation Details

Data Processing Pipeline

  • Sample Rate: 16000 Hz
  • Duration: 5 seconds per clip
  • Mel Spectrograms: 128 frequency bins
  • Time Steps: 109 frames (each spectrogram is padded or truncated to this width)
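
The fixed-width step in the list above can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper names `expected_frames` and `fix_width` are hypothetical, zero-padding on the right is an assumption, and the hop length of 512 samples is assumed (it is a common default for mel spectrograms but is not stated here). A random array stands in for a real spectrogram so the sketch needs only numpy.

```python
import numpy as np

SAMPLE_RATE = 16000   # Hz, from the list above
DURATION_S = 5        # seconds per clip, from the list above
N_MELS = 128          # mel frequency bins, from the list above
TIME_STEPS = 109      # fixed spectrogram width, from the list above


def expected_frames(n_samples, hop_length=512):
    """Frame count of a centered STFT. hop_length=512 is an assumption."""
    return 1 + n_samples // hop_length


def fix_width(spec, width=TIME_STEPS):
    """Zero-pad or truncate a (n_mels, frames) array along the time axis."""
    frames = spec.shape[1]
    if frames < width:
        # Pad only on the right of the time axis (assumed padding scheme).
        return np.pad(spec, ((0, 0), (0, width - frames)), mode="constant")
    return spec[:, :width]


# Stand-in for a real mel spectrogram: random values, not audio.
frames = expected_frames(SAMPLE_RATE * DURATION_S)
fake_spec = np.random.default_rng(0).random((N_MELS, frames))

# Add a trailing channel axis so the array matches a (128, 109, 1) input.
model_input = fix_width(fake_spec)[..., np.newaxis]
print(frames, model_input.shape)
```

Under the assumed hop length, a full 5-second clip yields 157 frames, so a width of 109 would truncate full-length clips and pad only the shorter ones. A different hop length would change that balance.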

Model Architecture

  • Input Shape: (128, 109, 1)
  • Convolutional Layers: 2 layers
  • Pooling Layers: 2 layers
  • Dense Layers: 128 units
  • Output: 2 classes (softmax)
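
To see how the input shape shrinks through the network, the sketch below traces tensor shapes through one plausible layer stack. Only the input shape, the layer counts, the 128 dense units and the 2-class softmax come from the list above. The layer order (conv, pool, conv, pool, flatten, dense, softmax), the 3x3 kernels with no padding, the 2x2 pooling and the filter counts of 32 and 64 are all assumptions made for illustration.

```python
def conv_valid(shape, filters, kernel=3):
    """Output shape of a stride-1 convolution with no padding (assumed)."""
    height, width, _ = shape
    return (height - kernel + 1, width - kernel + 1, filters)


def pool(shape, size=2):
    """Output shape of non-overlapping pooling (assumed 2x2)."""
    height, width, channels = shape
    return (height // size, width // size, channels)


shape = (128, 109, 1)            # input shape from the list above
trace = [("input", shape)]

# Two conv + pool stages; the filter counts 32 and 64 are assumptions.
for index, filters in enumerate((32, 64), start=1):
    shape = conv_valid(shape, filters)
    trace.append((f"conv{index}", shape))
    shape = pool(shape)
    trace.append((f"pool{index}", shape))

flat_units = shape[0] * shape[1] * shape[2]
trace.append(("flatten", (flat_units,)))
trace.append(("dense", (128,)))   # 128 units, from the list above
trace.append(("softmax", (2,)))   # 2 classes, from the list above

for name, layer_shape in trace:
    print(f"{name:8s} {layer_shape}")
```

With these assumed settings the flattened vector has 48000 values, so the first dense layer would hold most of the model's weights. Other kernel sizes or padding modes would give different intermediate shapes.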

Training Process

  • Dataset: ASVspoof2019_LA_train
  • Split: 80% training, 20% validation
  • Batch Size: 32
  • Epochs: 10
  • Optimizer: Adam
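
The split and batch settings above translate into simple arithmetic, sketched here with the standard library only. The helper names, the fixed seed, the shuffled (non-stratified) split and the placeholder clip IDs are hypothetical. The project may split its data differently.

```python
import math
import random


def train_val_split(items, val_fraction=0.2, seed=42):
    """Shuffle a copy of items and hold out val_fraction for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded for repeatability
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def steps_per_epoch(n_examples, batch_size=32):
    """Number of batches needed to cover the data once; last may be partial."""
    return math.ceil(n_examples / batch_size)


# Hypothetical clip IDs standing in for the real training files.
clip_ids = [f"clip_{i:04d}" for i in range(1000)]
train_ids, val_ids = train_val_split(clip_ids)

print(len(train_ids), len(val_ids))        # 800 200
print(steps_per_epoch(len(train_ids)))     # 25 batches per epoch
```

With 10 epochs, this toy example of 800 training clips would run 250 optimizer steps in total. The real count depends on the size of the training set.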

File Structure

project/
├── LA/
│   ├── ASVspoof2019_LA_train/
│   │   └── flac/
│   └── ASVspoof2019_LA_cm_protocols/
│       └── ASVspoof2019.LA.cm.train.trn.txt
├── TestEvaluation/
│   └── test_eval.txt
└── audio_classifier.h5
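
The layout above suggests how audio paths and labels could be paired. The sketch below assumes each protocol line is whitespace-separated, with the audio file ID in the second field and the key (`bonafide` or `spoof`) in the last field. The sample lines use made-up IDs in that assumed format, and the 0/1 label encoding and the helper name `parse_protocol_line` are hypothetical.

```python
from pathlib import Path

# Paths taken from the file tree above.
DATA_ROOT = Path("project") / "LA"
AUDIO_DIR = DATA_ROOT / "ASVspoof2019_LA_train" / "flac"
PROTOCOL = (DATA_ROOT / "ASVspoof2019_LA_cm_protocols"
            / "ASVspoof2019.LA.cm.train.trn.txt")

# Assumed class encoding; the project may use the opposite order.
LABELS = {"bonafide": 0, "spoof": 1}


def parse_protocol_line(line):
    """Return (audio path, integer label) for one protocol line."""
    fields = line.split()
    file_id, key = fields[1], fields[-1]
    return AUDIO_DIR / f"{file_id}.flac", LABELS[key]


# Made-up lines in the assumed format, not real dataset entries.
sample_lines = [
    "LA_0001 LA_T_0000001 - - bonafide",
    "LA_0001 LA_T_0000002 - A01 spoof",
]
pairs = [parse_protocol_line(line) for line in sample_lines]
for path, label in pairs:
    print(path.name, label)
```

In a real run, the lines would come from reading `PROTOCOL` rather than from `sample_lines`.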