Introduction
Determining textile composition is critical for textile recycling and degradation, yet the sensors currently on the market are too expensive for most people. Consider the front-line workers at a clothing drop box or textile recycling plant, who must roughly categorize incoming textiles to streamline the recycling process. Often, though, the labels are damaged or cut off, so the workers cannot categorize the textiles efficiently, and they do not have access to expensive textile sensors either. With textile-identifying software that runs on a phone, however, they could categorize textiles easily and efficiently, speeding up the recycling process. We therefore present our approach, FiberSense, an ensemble machine learning model that combines EfficientNet-B3 and ConvNeXt-Tiny (and later ResNet50 as well) to determine the fiber composition of cloth from a picture. By removing the need to purchase expensive sensors, our approach makes textile fiber identification available for everyday use.
FiberSense identifies 10 types of textile fibers directly from an image, making fiber recognition accessible without costly sensors. Given a picture of a cloth, it outputs the three most likely fibers along with their probabilities, enabling practical, everyday use in sustainability and textile applications.

Introducing Our Dataset

TextileNet is a dataset built specifically for identifying the fibers and fabrics of clothing through a material taxonomy. It contains 760,949 images of clothes, organized into 33 fiber labels and 27 fabric labels. TextileNet aims to provide a low-cost, efficient method for identifying and categorizing textiles while also standardizing textile-related datasets (Su et al., 2023).

Observations
After training the baseline model (ResNet18), we tested it by uploading single images and inspecting the top three fibers it predicted.
However, we noticed that in most identifications, cotton and polyester (the two dominant categories) appeared in the top three even when neither was the correct answer. We hypothesized that, due to data imbalance, the model defaulted to predicting polyester or cotton without truly learning distinguishing features: polyester and cotton samples accounted for nearly half of the dataset. Our solution to this bias is to balance the data using methods such as weighted sampling, weighted loss, and Focal Loss.
Color
We also tried to determine other factors that interfere with the model’s accuracy. The first factor is color. We found four rainbow-colored garments made of wool, cotton, nylon, and polyester, and ran them through the model. It turns out that color likely affects the model’s predictions: the top-1 prediction for the wool, cotton, and nylon pictures was polyester in every case. One reason may be that a quarter of the model’s training images were polyester; another possibility is that polyester clothes tend to feature multiple bright colors.

Looking at the top-2 predictions, all of the garments were classified as wool except the cotton picture, which was identified as camel. The cotton picture is less saturated than the other three and closely resembles most of the camel images, so we believe its color led the model to predict camel.

To confirm that color affects the model’s predictions, we found two more pictures of the same garment in different colors and compared the results.

The black shirt and the pink shirt are both made of cotton, which the model correctly predicted as the top-1 class in both cases. However, the top-2 and top-3 predictions and their probabilities differ, showing that color can influence prediction.
Unidentifiable Objects
We then closely examined the pictures in the dataset and found several images that are unsuitable for training. For instance, the “alpaca” category contains pictures of actual alpacas rather than alpaca-fiber garments, and the “yak” category contains a picture of unidentifiable food, which is not yak fiber. Such unrelated data can reduce the software’s accuracy.

Skin Exposure
In addition, as noted in the paper “TextileNet: A Material Taxonomy-based Fashion Textile Dataset”, high skin exposure can affect prediction results, which is why the dataset creators removed heads and feet from images. We tested this by comparing two photos of the same garment, one including the head and one without.
Images with heads, and therefore more skin exposure, tend to be predicted as polyester more often; in all three of our tests, polyester appeared as the top-2 prediction. Nevertheless, many images in the dataset contain over 50% skin exposure, especially in the polyester and nylon fiber categories.

Unwanted Categories
We also realized that keeping all 33 fiber categories is impractical, as many uncommon fibers have very few images. Aiming for everyday use, we chose 10 fiber categories for our model, based on global textile fiber usage reported by Textile Exchange.

Data Processing
To better align the dataset with our software’s goal, we narrowed it down to 10 fiber categories, chosen according to world textile fiber usage reported by Textile Exchange: Polyester, Cotton, Nylon, Viscose, Polypropylene, Acrylics, Flax, Hemp, Acetate, and Lyocell. Two of the ten (Polypropylene and Acetate) are not included in the TextileNet dataset, and the Acrylics and Lyocell categories also lacked sufficient images, so we supplemented these four categories by crawling images via the Google API, as sketched below.
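Such a crawl could be done with the Google Custom Search JSON API; the sketch below is an assumption about the setup rather than our exact script, and API_KEY, CX, crawl_fiber_images, and the query/output layout are placeholder names:

import os
import requests

# Placeholder credentials -- a real API key and Programmable Search Engine ID
# are required to run this.
API_KEY = 'YOUR_API_KEY'
CX = 'YOUR_SEARCH_ENGINE_ID'

def crawl_fiber_images(query, out_dir, num_images=100):
    """Download image search results for one fiber category."""
    os.makedirs(out_dir, exist_ok=True)
    saved = 0
    # The Custom Search JSON API returns at most 10 results per request,
    # paged through the 'start' parameter (capped at 100 results total).
    for start in range(1, num_images + 1, 10):
        resp = requests.get(
            'https://www.googleapis.com/customsearch/v1',
            params={'key': API_KEY, 'cx': CX, 'q': query,
                    'searchType': 'image', 'num': 10, 'start': start},
            timeout=30,
        )
        for item in resp.json().get('items', []):
            try:
                img = requests.get(item['link'], timeout=30)
                with open(os.path.join(out_dir, f'{saved:05d}.jpg'), 'wb') as f:
                    f.write(img.content)
                saved += 1
            except requests.RequestException:
                continue  # skip dead or slow links
    return saved

# e.g. crawl_fiber_images('acetate fabric clothing', 'data/Acetate')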
Data Cleansing
We then performed data cleansing on the 10 categories to maintain high-quality data. We deleted pictures that were not clothing or were unrelated to their category, then filtered out pictures with over 50% skin exposure (a rough sketch of such a filter follows).
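The skin-exposure check can be approximated as below; this is an illustrative sketch using a simple HSV skin-tone threshold in OpenCV, and the threshold values are assumptions rather than our tuned parameters:

import cv2
import numpy as np

def skin_ratio(image_path):
    """Estimate the fraction of pixels falling in a rough skin-tone range."""
    img = cv2.imread(image_path)
    if img is None:
        return 1.0  # unreadable file: treat as discardable
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # Rough skin-tone band in HSV; a production filter would need several
    # tone ranges and lighting normalization.
    lower = np.array([0, 40, 60], dtype=np.uint8)
    upper = np.array([25, 180, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)
    return float(np.count_nonzero(mask)) / mask.size

# Keep an image only if less than half of it looks like skin:
# keep = skin_ratio(path) <= 0.5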
Fiber Category | Number of Images |
---|---|
Acetate | 1027 |
Acrylics | 1639 |
Cotton | 11467 |
Flax | 3132 |
Hemp | 2064 |
Lyocell | 1826 |
Nylon | 1826 |
Polyester | 5099 |
Polypropylene | 961 |
Viscose | 2726 |
Total | 39840 |
Table 1. Number of images per fiber category after data processing
Model Structure
We trained for 70 epochs, the same as the TextileNet baseline model. To improve accuracy, we implemented an ensemble model combining EfficientNet-B3 and ConvNeXt-Tiny.
Ensemble learning offers four main benefits for our goal (a minimal sketch of the ensemble model follows the list).
- Ensemble reduces errors through averaging. When models make different types of mistakes, averaging their predictions can reduce overall error rates.
- Ensemble captures complementary features. Different architectures may learn different aspects of fiber textures.
- EfficientNet focuses on fine-grained patterns while ConvNeXt captures broader structural features. Combining two models enhances model diversity.
- Increased robustness: if one model fails on certain fiber types, the other might compensate.
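The training code in Trials 1 and 2 below unpacks four outputs from a FiberEnsembleModel class that is not reproduced in this report. A minimal sketch of its structure might look like the following; the plain linear heads are an assumption, and the 0.3/0.3/0.4 output weighting follows the "same as before" comment in the Trial 3 code:

import timm
import torch
import torch.nn as nn

class FiberEnsembleModel(nn.Module):
    """Sketch of the two-backbone ensemble used in Trials 1 and 2."""
    def __init__(self, num_classes):
        super().__init__()
        # Backbone 1: EfficientNet-B3, with its classifier stripped so the
        # forward pass yields pooled features
        self.efficientnet = timm.create_model('efficientnet_b3', pretrained=True)
        eff_features = self.efficientnet.classifier.in_features
        self.efficientnet.classifier = nn.Identity()
        # Backbone 2: ConvNeXt-Tiny, stripped the same way
        self.convnext = timm.create_model('convnext_tiny', pretrained=True)
        conv_features = self.convnext.head.fc.in_features
        self.convnext.head.fc = nn.Identity()
        # One head per backbone plus a fusion head over concatenated features
        self.eff_classifier = nn.Linear(eff_features, num_classes)
        self.conv_classifier = nn.Linear(conv_features, num_classes)
        self.fusion_classifier = nn.Linear(eff_features + conv_features, num_classes)

    def forward(self, x):
        eff_feat = self.efficientnet(x)
        conv_feat = self.convnext(x)
        eff_out = self.eff_classifier(eff_feat)
        conv_out = self.conv_classifier(conv_feat)
        fusion_out = self.fusion_classifier(torch.cat([eff_feat, conv_feat], dim=1))
        # Weighted combination of the three heads
        final_out = 0.3 * eff_out + 0.3 * conv_out + 0.4 * fusion_out
        return final_out, eff_out, conv_out, fusion_out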
Trial 1

Results
The overall performance of Trial 1 is promising. The model converges well, with the loss dropping smoothly from 2.6 to 0.8. Training remains stable, with no severe overfitting despite the complexity of the ensemble model. The learning rate schedule also proves effective: cosine annealing with warm restarts performs well. The final testing accuracy is 78.59%.

Potential Issues
Though the results are good, there are still some potential issues. First, the weighted sampling might be too aggressive: our weighting calculation (weight = total_samples / (num_classes * count)) can assign very high weights to minority classes, potentially causing the model to see the same minority samples many times per epoch. In addition, using both weighted sampling and weighted loss might over-compensate. Therefore, we decided to try a different approach to data processing.
Trial 1
# Imports and device setup assumed by the snippets in this report
import os
from collections import Counter

import matplotlib.pyplot as plt
import numpy as np
import timm
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from PIL import Image
from torch.utils.data import Dataset, DataLoader, WeightedRandomSampler
from tqdm import tqdm

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

class BalancedFiberDataset(Dataset):
"""Custom dataset that handles data imbalance through weighted sampling."""
def __init__(self, root, transform=None, is_valid_file=None):
self.root = root
self.transform = transform
self.is_valid_file = is_valid_file
# Load all samples
self.samples = []
self.classes = []
self.class_to_idx = {}
self._load_samples()
# Calculate class weights for balancing
self.class_weights = self._calculate_class_weights()
self._print_class_distribution()
def _load_samples(self):
"""Load all valid samples from the dataset."""
classes = [d for d in os.listdir(self.root)
if os.path.isdir(os.path.join(self.root, d)) and not d.startswith('.')]
classes.sort()
self.classes = classes
self.class_to_idx = {cls_name: i for i, cls_name in enumerate(classes)}
for class_name in classes:
class_dir = os.path.join(self.root, class_name)
for filename in os.listdir(class_dir):
if self.is_valid_file is None or self.is_valid_file(filename):
path = os.path.join(class_dir, filename)
item = (path, self.class_to_idx[class_name])
self.samples.append(item)
def _calculate_class_weights(self):
"""Calculate inverse frequency weights for each class."""
class_counts = Counter([label for _, label in self.samples])
total_samples = len(self.samples)
num_classes = len(self.classes)
weights = []
for i in range(num_classes):
count = class_counts.get(i, 1)
weight = total_samples / (num_classes * count)
weights.append(weight)
return weights
def get_sample_weights(self):
"""Get weights for WeightedRandomSampler."""
sample_weights = []
for _, label in self.samples:
sample_weights.append(self.class_weights[label])
return sample_weights
def _print_class_distribution(self):
"""Print class distribution statistics."""
class_counts = Counter([label for _, label in self.samples])
print("")
print("=== DATASET STATISTICS ===")
print(f"Total samples: {len(self.samples)}")
print(f"Number of classes: {len(self.classes)}")
print("")
print("Class distribution:")
for i, class_name in enumerate(self.classes):
count = class_counts.get(i, 0)
percentage = (count / len(self.samples)) * 100
print(f" {class_name}: {count} samples ({percentage:.1f}%)")
# Balance metrics
counts = list(class_counts.values())
if counts:
min_count, max_count = min(counts), max(counts)
balance_ratio = min_count / max_count if max_count > 0 else 0
print(f"")
print(f"Balance ratio (min/max): {balance_ratio:.3f}")
print(f"Imbalance factor: {max_count/min_count:.1f}x")
def __len__(self):
return len(self.samples)
def __getitem__(self, idx):
path, label = self.samples[idx]
image = Image.open(path).convert('RGB')
if self.transform:
image = self.transform(image)
return image, label
def train_ensemble_model(data_root, num_epochs=25, batch_size=24, learning_rate=0.0005):
"""Train the ensemble model with data balancing."""
print("="*80)
print("FIBER CLASSIFICATION ENSEMBLE TRAINING")
print("="*80)
print(f"Data root: {data_root}")
print(f"Epochs: {num_epochs}")
print(f"Batch size: {batch_size}")
print(f"Learning rate: {learning_rate}")
print("="*80)
# Image validation function
def is_img(path: str) -> bool:
ok_ext = {'.jpg', '.jpeg', '.png', '.bmp', '.tif', '.tiff', '.webp'}
return path.lower().endswith(tuple(ok_ext))
    # Create transforms (create_transforms itself is not reproduced in this
    # report; create_transforms_improved in Trial 3 shows its improved counterpart)
    train_transform, val_transform = create_transforms()
# Create balanced dataset
dataset = BalancedFiberDataset(
root=data_root,
transform=train_transform,
is_valid_file=is_img
)
# Create weighted sampler for balanced training
sample_weights = dataset.get_sample_weights()
sampler = WeightedRandomSampler(
weights=sample_weights,
num_samples=len(sample_weights),
replacement=True
)
# Create dataloader
train_loader = DataLoader(
dataset,
batch_size=batch_size,
sampler=sampler,
num_workers=2,
pin_memory=True
)
print("")
print(f"Training with {len(dataset)} images from {len(dataset.classes)} classes")
print(f"Classes: {dataset.classes}")
# Create ensemble model
model = FiberEnsembleModel(num_classes=len(dataset.classes))
model = model.to(device)
# Use DataParallel if multiple GPUs available
if torch.cuda.device_count() > 1:
print(f"")
print(f"Using {torch.cuda.device_count()} GPUs!")
model = nn.DataParallel(model)
# Loss function with class weights
class_weights = torch.FloatTensor(dataset.class_weights).to(device)
criterion = nn.CrossEntropyLoss(weight=class_weights)
# Optimizer - different learning rates for pretrained and new layers
optimizer = optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=0.01)
# Learning rate scheduler
scheduler = optim.lr_scheduler.CosineAnnealingWarmRestarts(
optimizer, T_0=7, T_mult=2, eta_min=1e-6
)
# Training metrics
train_losses = []
train_accuracies = []
print("")
print("Starting training...")
# Training loop
for epoch in range(num_epochs):
model.train()
running_loss = 0.0
running_corrects = 0
total_samples = 0
# Progress bar
pbar = tqdm(train_loader, desc=f'Epoch {epoch+1}/{num_epochs}')
for batch_idx, (inputs, targets) in enumerate(pbar):
inputs, targets = inputs.to(device), targets.to(device)
# Zero gradients
optimizer.zero_grad()
# Forward pass
final_output, eff_output, conv_output, fusion_output = model(inputs)
# Calculate losses for ensemble training
loss_final = criterion(final_output, targets)
loss_eff = criterion(eff_output, targets)
loss_conv = criterion(conv_output, targets)
loss_fusion = criterion(fusion_output, targets)
# Combined loss
total_loss = loss_final + 0.3 * (loss_eff + loss_conv + loss_fusion)
# Backward pass
total_loss.backward()
optimizer.step()
# Statistics
running_loss += total_loss.item()
_, predicted = final_output.max(1)
running_corrects += predicted.eq(targets).sum().item()
total_samples += targets.size(0)
# Update progress bar
current_acc = 100. * running_corrects / total_samples
pbar.set_postfix({
'Loss': f'{running_loss/(batch_idx+1):.3f}',
'Acc': f'{current_acc:.2f}%',
'LR': f'{optimizer.param_groups[0]["lr"]:.2e}'
})
# Epoch statistics
epoch_loss = running_loss / len(train_loader)
epoch_acc = 100. * running_corrects / total_samples
train_losses.append(epoch_loss)
train_accuracies.append(epoch_acc)
print(f'')
print(f'Epoch {epoch+1}: Loss = {epoch_loss:.4f}, Accuracy = {epoch_acc:.2f}%')
# Step scheduler
scheduler.step()
# Save checkpoint every 5 epochs
if (epoch + 1) % 5 == 0:
checkpoint = {
'epoch': epoch,
'model_state_dict': model.state_dict(),
'optimizer_state_dict': optimizer.state_dict(),
'loss': epoch_loss,
'accuracy': epoch_acc,
'classes': dataset.classes,
'class_weights': dataset.class_weights
}
checkpoint_path = f'/content/drive/MyDrive/ML/Fibre/ensemble_checkpoint_epoch_{epoch+1}.pth'
torch.save(checkpoint, checkpoint_path)
print(f'Checkpoint saved: {checkpoint_path}')
print("")
print("="*80)
print("TRAINING COMPLETED!")
print(f"Final Accuracy: {train_accuracies[-1]:.2f}%")
print("="*80)
# Plot training curves
plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.plot(train_losses, 'b-', linewidth=2)
plt.title('Training Loss', fontsize=14)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.grid(True)
plt.subplot(1, 3, 2)
plt.plot(train_accuracies, 'g-', linewidth=2)
plt.title('Training Accuracy', fontsize=14)
plt.xlabel('Epoch')
plt.ylabel('Accuracy (%)')
plt.grid(True)
plt.subplot(1, 3, 3)
plt.bar(range(len(dataset.classes)),
[len([s for s in dataset.samples if s[1] == i]) for i in range(len(dataset.classes))])
plt.title('Class Distribution', fontsize=14)
plt.xlabel('Class Index')
plt.ylabel('Number of Samples')
plt.xticks(range(len(dataset.classes)), dataset.classes, rotation=45)
plt.grid(True)
plt.tight_layout()
plt.show()
return model, dataset.classes, dataset.class_weights
Trial 2

While searching for different approaches to handling our dataset, we found Focal Loss, a loss function commonly used to counter class imbalance. To prevent further imbalance, we also performed stratified splits, partitioning the data into 0.68, 0.17, and 0.15 for training, validation, and testing. Thus, for Trial 2, we used Focal Loss together with a stratified split instead of weighted sampling and weighted loss.
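For reference, Focal Loss rescales the per-sample cross-entropy so that easy, already-well-classified examples contribute little. With p_t the model’s predicted probability for the true class, FL(p_t) = -α(1 - p_t)^γ · log(p_t); the (1 - p_t)^γ factor shrinks toward zero as p_t approaches 1, while hard examples keep nearly full weight. The implementation below uses α = 1 and γ = 2, and recovers p_t from the per-sample cross-entropy as p_t = exp(-CE).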

Trial 2
class FocalLoss(nn.Module):
def __init__(self, alpha=1, gamma=2):
super().__init__()
self.alpha = alpha
self.gamma = gamma
def forward(self, inputs, targets):
ce_loss = F.cross_entropy(inputs, targets, reduction='none')
pt = torch.exp(-ce_loss)
focal_loss = self.alpha * (1-pt)**self.gamma * ce_loss
return focal_loss.mean()
def create_stratified_split(dataset, test_size=0.2):
    from sklearn.model_selection import train_test_split
    # Get labels
labels = [sample[1] for sample in dataset.samples]
indices = list(range(len(labels)))
train_idx, val_idx = train_test_split(
indices,
test_size=test_size,
stratify=labels,
random_state=42
)
return train_idx, val_idx
def train_ensemble_model_with_focal_loss(data_root, num_epochs=25, batch_size=24, learning_rate=0.0005,
focal_alpha=1.0, focal_gamma=2.0, validation_split=0.2):
"""Train the ensemble model with Focal Loss and validation tracking."""
print("="*80)
print("FIBER CLASSIFICATION ENSEMBLE TRAINING (with Focal Loss)")
print("="*80)
print(f"Data root: {data_root}")
print(f"Epochs: {num_epochs}")
print(f"Batch size: {batch_size}")
print(f"Learning rate: {learning_rate}")
print(f"Focal Loss - Alpha: {focal_alpha}, Gamma: {focal_gamma}")
print(f"Validation split: {validation_split}")
print("="*80)
# Image validation function
def is_img(path: str) -> bool:
ok_ext = {'.jpg', '.jpeg', '.png', '.bmp', '.tif', '.tiff', '.webp'}
return path.lower().endswith(tuple(ok_ext))
# Create transforms
train_transform, val_transform = create_transforms()
# Create full dataset
full_dataset = BalancedFiberDataset(
root=data_root,
transform=None, # We'll apply transforms later
is_valid_file=is_img
)
# Create stratified train/validation split
from sklearn.model_selection import train_test_split
labels = [sample[1] for sample in full_dataset.samples]
indices = list(range(len(labels)))
train_idx, val_idx = train_test_split(
indices,
test_size=validation_split,
stratify=labels,
random_state=42
)
# Create separate datasets for train and validation
train_samples = [full_dataset.samples[i] for i in train_idx]
val_samples = [full_dataset.samples[i] for i in val_idx]
# Custom dataset classes for train/val
class CustomDataset(Dataset):
def __init__(self, samples, classes, class_to_idx, transform):
self.samples = samples
self.classes = classes
self.class_to_idx = class_to_idx
self.transform = transform
def __len__(self):
return len(self.samples)
def __getitem__(self, idx):
path, label = self.samples[idx]
image = Image.open(path).convert('RGB')
if self.transform:
image = self.transform(image)
return image, label
train_dataset = CustomDataset(train_samples, full_dataset.classes,
full_dataset.class_to_idx, train_transform)
val_dataset = CustomDataset(val_samples, full_dataset.classes,
full_dataset.class_to_idx, val_transform)
    # Class weights for the training split (computed for reference only;
    # Trial 2 deliberately relies on Focal Loss rather than weighted sampling)
    train_class_counts = Counter([label for _, label in train_samples])
    train_sample_weights = []
    for _, label in train_samples:
        weight = len(train_samples) / (len(full_dataset.classes) * train_class_counts[label])
        train_sample_weights.append(weight)
# Create dataloaders
train_loader = DataLoader(
train_dataset,
batch_size=batch_size,
shuffle=True, # Just use regular shuffling
num_workers=2,
pin_memory=True
)
val_loader = DataLoader(
val_dataset,
batch_size=batch_size,
shuffle=False,
num_workers=2,
pin_memory=True
)
print(f"Training with {len(train_dataset)} images, Validation with {len(val_dataset)} images")
print(f"Classes: {full_dataset.classes}")
# Create ensemble model
model = FiberEnsembleModel(num_classes=len(full_dataset.classes))
model = model.to(device)
# Use DataParallel if multiple GPUs available
if torch.cuda.device_count() > 1:
print(f"Using {torch.cuda.device_count()} GPUs!")
model = nn.DataParallel(model)
# Focal Loss instead of weighted CrossEntropy
criterion = FocalLoss(alpha=focal_alpha, gamma=focal_gamma)
# Optimizer
optimizer = optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=0.01)
# Learning rate scheduler
scheduler = optim.lr_scheduler.CosineAnnealingWarmRestarts(
optimizer, T_0=7, T_mult=2, eta_min=1e-6
)
# Training metrics
train_losses = []
train_accuracies = []
val_losses = []
val_accuracies = []
best_val_acc = 0.0
print("")
print("Starting training...")
# Training loop
for epoch in range(num_epochs):
# Training phase
model.train()
running_loss = 0.0
running_corrects = 0
total_samples = 0
pbar = tqdm(train_loader, desc=f'Epoch {epoch+1}/{num_epochs} [TRAIN]')
for batch_idx, (inputs, targets) in enumerate(pbar):
inputs, targets = inputs.to(device), targets.to(device)
optimizer.zero_grad()
# Forward pass
final_output, eff_output, conv_output, fusion_output = model(inputs)
# Calculate losses using Focal Loss
loss_final = criterion(final_output, targets)
loss_eff = criterion(eff_output, targets)
loss_conv = criterion(conv_output, targets)
loss_fusion = criterion(fusion_output, targets)
# Combined loss
total_loss = loss_final + 0.3 * (loss_eff + loss_conv + loss_fusion)
# Backward pass
total_loss.backward()
optimizer.step()
# Statistics
running_loss += total_loss.item()
_, predicted = final_output.max(1)
running_corrects += predicted.eq(targets).sum().item()
total_samples += targets.size(0)
# Update progress bar
current_acc = 100. * running_corrects / total_samples
pbar.set_postfix({
'Loss': f'{running_loss/(batch_idx+1):.3f}',
'Acc': f'{current_acc:.2f}%',
'LR': f'{optimizer.param_groups[0]["lr"]:.2e}'
})
# Training epoch statistics
epoch_train_loss = running_loss / len(train_loader)
epoch_train_acc = 100. * running_corrects / total_samples
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_acc)
# Validation phase
model.eval()
val_running_loss = 0.0
val_running_corrects = 0
val_total_samples = 0
all_val_preds = []
all_val_targets = []
with torch.no_grad():
val_pbar = tqdm(val_loader, desc=f'Epoch {epoch+1}/{num_epochs} [VAL]')
for inputs, targets in val_pbar:
inputs, targets = inputs.to(device), targets.to(device)
final_output, eff_output, conv_output, fusion_output = model(inputs)
# Calculate validation loss
loss_final = criterion(final_output, targets)
loss_eff = criterion(eff_output, targets)
loss_conv = criterion(conv_output, targets)
loss_fusion = criterion(fusion_output, targets)
total_loss = loss_final + 0.3 * (loss_eff + loss_conv + loss_fusion)
val_running_loss += total_loss.item()
_, predicted = final_output.max(1)
val_running_corrects += predicted.eq(targets).sum().item()
val_total_samples += targets.size(0)
# Store predictions for detailed analysis
all_val_preds.extend(predicted.cpu().numpy())
all_val_targets.extend(targets.cpu().numpy())
val_current_acc = 100. * val_running_corrects / val_total_samples
val_pbar.set_postfix({
'Val Loss': f'{val_running_loss/(len(all_val_preds)//batch_size+1):.3f}',
'Val Acc': f'{val_current_acc:.2f}%'
})
# Validation epoch statistics
epoch_val_loss = val_running_loss / len(val_loader)
epoch_val_acc = 100. * val_running_corrects / val_total_samples
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_acc)
print(f'Epoch {epoch+1}:')
print(f' Train - Loss: {epoch_train_loss:.4f}, Acc: {epoch_train_acc:.2f}%')
print(f' Val - Loss: {epoch_val_loss:.4f}, Acc: {epoch_val_acc:.2f}%')
# Per-class validation metrics every 10 epochs
if (epoch + 1) % 10 == 0:
print("\nPer-class Validation Metrics:")
from sklearn.metrics import classification_report
report = classification_report(
all_val_targets,
all_val_preds,
target_names=full_dataset.classes,
output_dict=True,
zero_division=0
)
for class_name in full_dataset.classes:
if class_name in report:
precision = report[class_name]['precision']
recall = report[class_name]['recall']
f1 = report[class_name]['f1-score']
support = report[class_name]['support']
print(f" {class_name}: P={precision:.3f}, R={recall:.3f}, F1={f1:.3f}, Support={support}")
# Step scheduler
scheduler.step()
# Save best model
if epoch_val_acc > best_val_acc:
best_val_acc = epoch_val_acc
best_checkpoint = {
'epoch': epoch,
'model_state_dict': model.state_dict(),
'optimizer_state_dict': optimizer.state_dict(),
'train_loss': epoch_train_loss,
'val_loss': epoch_val_loss,
'train_accuracy': epoch_train_acc,
'val_accuracy': epoch_val_acc,
'classes': full_dataset.classes,
'all_val_preds': all_val_preds,
'all_val_targets': all_val_targets
}
best_model_path = f'/content/drive/MyDrive/ML/Fibre/best_ensemble_model.pth'
torch.save(best_checkpoint, best_model_path)
# Save checkpoint every 10 epochs
if (epoch + 1) % 10 == 0:
checkpoint = {
'epoch': epoch,
'model_state_dict': model.state_dict(),
'optimizer_state_dict': optimizer.state_dict(),
'train_loss': epoch_train_loss,
'val_loss': epoch_val_loss,
'train_accuracy': epoch_train_acc,
'val_accuracy': epoch_val_acc,
'classes': full_dataset.classes,
}
checkpoint_path = f'/content/drive/MyDrive/ML/Fibre/ensemble_checkpoint_epoch_{epoch+1}.pth'
torch.save(checkpoint, checkpoint_path)
print(f'Checkpoint saved: {checkpoint_path}')
print("")
print("="*80)
print("TRAINING COMPLETED!")
print(f"Best Validation Accuracy: {best_val_acc:.2f}%")
print(f"Final Train Accuracy: {train_accuracies[-1]:.2f}%")
print(f"Final Validation Accuracy: {val_accuracies[-1]:.2f}%")
print("="*80)
# Plot comprehensive training curves
plt.figure(figsize=(20, 10))
# Loss curves
plt.subplot(2, 4, 1)
plt.plot(train_losses, 'b-', linewidth=2, label='Train')
plt.plot(val_losses, 'r-', linewidth=2, label='Validation')
plt.title('Training & Validation Loss', fontsize=14)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True)
# Accuracy curves
plt.subplot(2, 4, 2)
plt.plot(train_accuracies, 'b-', linewidth=2, label='Train')
plt.plot(val_accuracies, 'r-', linewidth=2, label='Validation')
plt.title('Training & Validation Accuracy', fontsize=14)
plt.xlabel('Epoch')
plt.ylabel('Accuracy (%)')
plt.legend()
plt.grid(True)
# Class distribution
plt.subplot(2, 4, 3)
class_counts = Counter([label for _, label in full_dataset.samples])
plt.bar(range(len(full_dataset.classes)),
[class_counts[i] for i in range(len(full_dataset.classes))])
plt.title('Total Class Distribution', fontsize=14)
plt.xlabel('Class Index')
plt.ylabel('Number of Samples')
plt.xticks(range(len(full_dataset.classes)), full_dataset.classes, rotation=45)
plt.grid(True)
# Train vs Val distribution
plt.subplot(2, 4, 4)
train_counts = Counter([label for _, label in train_samples])
val_counts = Counter([label for _, label in val_samples])
x = np.arange(len(full_dataset.classes))
width = 0.35
plt.bar(x - width/2, [train_counts[i] for i in range(len(full_dataset.classes))],
width, label='Train', alpha=0.8)
plt.bar(x + width/2, [val_counts[i] for i in range(len(full_dataset.classes))],
width, label='Validation', alpha=0.8)
plt.title('Train/Val Split Distribution', fontsize=14)
plt.xlabel('Class Index')
plt.ylabel('Number of Samples')
plt.xticks(x, full_dataset.classes, rotation=45)
plt.legend()
plt.grid(True)
    plt.tight_layout()
    plt.show()
    # Final confusion matrix, drawn in its own figure (a subplot slot in the
    # 2x4 grid would be abandoned by the plt.figure() call below)
    from sklearn.metrics import confusion_matrix
    import seaborn as sns
    cm = confusion_matrix(best_checkpoint['all_val_targets'], best_checkpoint['all_val_preds'])
    plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
xticklabels=full_dataset.classes,
yticklabels=full_dataset.classes)
plt.title('Validation Confusion Matrix (Best Epoch)', fontsize=14)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.xticks(rotation=45)
plt.yticks(rotation=0)
plt.tight_layout()
plt.show()
return model, full_dataset.classes, best_val_acc
Results
Metric | Trial 1 | Trial 2 | Difference |
---|---|---|---|
Test Accuracy (top 1) | 78.59% | 61.94% | -16.65% |
Balanced Accuracy | 91.20% | 59.70% | -31.50% |
F1 Macro | 85.43% | 60.36% | -25.07% |
Table 2. Result comparison between Trial 1 and Trial 2
Trial 2 shows lower accuracy but greater stability than Trial 1. In Trial 1, training accuracy exceeded 90%, yet testing accuracy dropped to 78.59%, indicating considerable instability. Trial 2, by contrast, achieved a test accuracy of 61.94%, close to its validation accuracy of 63%. Overfitting is also somewhat less severe in Trial 2, with the gap narrowing from 32% to roughly 30%.

Challenges
Three main factors make fiber classification difficult. First, different fibers often have similar visual textures, making them hard for the model to distinguish. Second, lighting and color vary across the data and can dominate over fiber structure. Lastly, class imbalance in real-world datasets makes training difficult and inefficient, since it skews the effective weighting of classes during training.
Analyzing the Data
Trial 1 versus Trial 2
Trial 1 outperformed Trial 2 for several reasons. First, it handles class imbalance better: Trial 1’s balanced accuracy of 91.20% shows excellent performance on minority classes, while Trial 2’s 59.70% shows poor performance on them. Second, Trial 1 achieved much higher recall (the ability to find minority-class samples), which we attribute to its more uniform effective data distribution.
Class | Trial 1 | Trial 2 |
---|---|---|
Acetate | 99.4% | 63.8% |
Hemp | 99.4% | 43.3% |
Flax | 98.3% | 48.3% |
Lyocell | 99.7% | 56.2% |
Table 3. Recall of Trial 1 and Trial 2 on selected minority classes
The third reason is that Trial 1 maintained high F1-scores across the board, while Trial 2 struggled with most classes. From this analysis we concluded that Trial 1 performs better because the WeightedRandomSampler was genuinely effective on our dataset, and the weighted CrossEntropy loss worked well with that sampling strategy; the combination produced strong minority-class learning without sacrificing majority-class performance. In contrast, Focal Loss may have been too aggressive for our data, and removing weighted sampling hurt minority-class performance. Although we added a stratified validation split in Trial 2, it did not compensate for the absence of the WeightedRandomSampler.
Conclusion
To sum up, Trial 1’s success on the test set suggests that, even without validation monitoring, the aggressive weighted sampling forced the model to learn robust features for minority classes. The weighted loss kept it from simply memorizing majority classes, and the ensemble architecture provided enough regularization to generalize reasonably well.
In conclusion, balancing our fiber dataset (with its 10.6x imbalance factor) requires aggressive balancing techniques: weighted sampling plus weighted loss works better than Focal Loss. The data balancing in Trial 1 produced a model that learned to distinguish all fiber types effectively. Therefore, in Trial 3, we retained the overall structure of Trial 1 while making further refinements to develop the final model.

Final Model
Trial 3
To further improve accuracy and reduce the aggressiveness of the balancing, we made a series of modifications to Trial 1.
- We added the same stratified split used in Trial 2, partitioning the data into 0.68, 0.17, and 0.15 for training, validation, and testing. This keeps class proportions consistent across splits instead of relying on a random split, preventing further imbalance.
- We reduced Trial 1’s overfitting through stronger regularization, including higher dropout rates and weight decay in the optimizer.
- We improved data augmentation by adding more texture-specific augmentations, aiming to increase accuracy.
- We optimized the weighted sampling by applying smoothing to reduce extreme weights.
- We applied learning rate scheduling, which dynamically adjusts the learning rate over time. This speeds up learning, makes later fine-tuning more stable, further reduces overfitting, and improves generalization.
- We added ResNet50 as a third model to increase ensemble diversity (both TextileNet’s baseline and ours use ResNet18). ResNet50 has a strong texture bias, meaning it focuses on local patterns rather than global shapes, and ConvNeXt-Tiny’s architecture differs greatly from ResNet50’s, so the two can compensate for each other’s mistakes.
Trial 3
def get_balanced_weights_improved(class_counts, smoothing=0.7):
"""Less aggressive balancing to reduce overfitting on minority classes."""
total = sum(class_counts.values())
weights = []
for i in range(len(class_counts)):
count = class_counts.get(i, 1)
# Apply smoothing to reduce extreme weights
raw_weight = total / (len(class_counts) * count)
smoothed_weight = raw_weight ** smoothing # Reduce extreme weighting
        weights.append(smoothed_weight)
    return weights
class FiberEnsembleModelImproved(nn.Module):
"""Improved ensemble with enhanced regularization."""
def __init__(self, num_classes, dropout_rate=0.5):
super(FiberEnsembleModelImproved, self).__init__()
self.num_classes = num_classes
# Model 1: EfficientNet-B3
print("Loading EfficientNet-B3...")
self.efficientnet = timm.create_model('efficientnet_b3', pretrained=True)
        eff_features = self.efficientnet.classifier.in_features  # feature width, captured before replacing the head
self.efficientnet.classifier = nn.Identity()
# Model 2: ConvNeXt-Tiny
print("Loading ConvNeXt-Tiny...")
self.convnext = timm.create_model('convnext_tiny', pretrained=True)
        conv_features = self.convnext.head.fc.in_features  # feature width, captured before replacing the head
self.convnext.head.fc = nn.Identity()
        # Build the per-backbone heads using the feature widths captured above
self.eff_classifier = nn.Sequential(
nn.Dropout(dropout_rate),
nn.Linear(eff_features, 512),
nn.ReLU(inplace=True),
nn.BatchNorm1d(512),
nn.Dropout(dropout_rate * 0.7),
nn.Linear(512, num_classes)
)
self.conv_classifier = nn.Sequential(
nn.Dropout(dropout_rate),
nn.Linear(conv_features, 512),
nn.ReLU(inplace=True),
nn.BatchNorm1d(512),
nn.Dropout(dropout_rate * 0.7),
nn.Linear(512, num_classes)
)
# Fusion classifier
total_features = eff_features + conv_features
self.fusion_classifier = nn.Sequential(
nn.Dropout(dropout_rate + 0.1), # Higher dropout for fusion
nn.Linear(total_features, 1024),
nn.BatchNorm1d(1024),
nn.ReLU(inplace=True),
nn.Dropout(dropout_rate * 0.8),
nn.Linear(1024, 512),
nn.BatchNorm1d(512),
nn.ReLU(inplace=True),
nn.Dropout(dropout_rate * 0.6),
nn.Linear(512, num_classes)
)
print(f"Improved ensemble model created with {num_classes} classes")
print(f"Dropout rate: {dropout_rate}")
def forward(self, x):
# Extract features
eff_features = self.efficientnet(x)
conv_features = self.convnext(x)
# Individual predictions
eff_output = self.eff_classifier(eff_features)
conv_output = self.conv_classifier(conv_features)
# Fusion prediction
combined_features = torch.cat([eff_features, conv_features], dim=1)
fusion_output = self.fusion_classifier(combined_features)
# Weighted ensemble (same as before)
final_output = 0.3 * eff_output + 0.3 * conv_output + 0.4 * fusion_output
return final_output, eff_output, conv_output, fusion_output
class FiberTripleEnsemble(nn.Module):
"""Add a third model for more diversity."""
def __init__(self, num_classes):
super().__init__()
# Add ResNet as third model
self.resnet = timm.create_model('resnet50', pretrained=True)
resnet_features = self.resnet.fc.in_features
self.resnet.fc = nn.Identity()
self.resnet_classifier = nn.Sequential(
nn.Dropout(0.4),
nn.Linear(resnet_features, 512),
nn.ReLU(),
nn.Dropout(0.3),
nn.Linear(512, num_classes)
)
        # Update fusion layer (eff_features and conv_features are the backbone
        # feature widths, obtained as in FiberEnsembleModelImproved)
        total_features = eff_features + conv_features + resnet_features
        # ... rest of ensemble code
def create_transforms_improved():
"""Create improved training transforms with enhanced texture augmentations."""
train_transform = transforms.Compose([
transforms.Resize((256, 256)),
transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
transforms.RandomHorizontalFlip(p=0.5),
transforms.RandomVerticalFlip(p=0.3),
transforms.RandomRotation(degrees=30),
# Enhanced color and texture augmentations
transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.2, hue=0.1),
transforms.RandomAdjustSharpness(sharpness_factor=2, p=0.3),
transforms.RandomAutocontrast(p=0.3),
# Fix GaussianBlur - wrap it in RandomApply for probability control
transforms.RandomApply([transforms.GaussianBlur(kernel_size=3)], p=0.2),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
transforms.RandomErasing(p=0.2, scale=(0.02, 0.2))
])
val_transform = transforms.Compose([
transforms.Resize((224, 224)),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
return train_transform, val_transform
def train_ensemble_model_improved(data_root, num_epochs=70, batch_size=24, learning_rate=0.0005,
dropout_rate=0.5, weight_decay=0.05, validation_split=0.15,
early_stopping_patience=10, smoothing_factor=0.7):
"""Trial 3: Improved version of Trial 1 with validation monitoring and regularization."""
print("="*80)
print("TRIAL 3: IMPROVED WEIGHTED SAMPLING TRAINING")
print("="*80)
print(f"Data root: {data_root}")
print(f"Max epochs: {num_epochs}")
print(f"Batch size: {batch_size}")
print(f"Learning rate: {learning_rate}")
print(f"Dropout rate: {dropout_rate}")
print(f"Weight decay: {weight_decay}")
print(f"Validation split: {validation_split}")
print(f"Early stopping patience: {early_stopping_patience}")
print("="*80)
# Image validation function
def is_img(path: str) -> bool:
ok_ext = {'.jpg', '.jpeg', '.png', '.bmp', '.tif', '.tiff', '.webp'}
return path.lower().endswith(tuple(ok_ext))
# Create improved transforms
train_transform, val_transform = create_transforms_improved()
# Create full dataset
full_dataset = BalancedFiberDataset(
root=data_root,
transform=None,
is_valid_file=is_img
)
# Create stratified train/val split
from sklearn.model_selection import train_test_split
labels = [sample[1] for sample in full_dataset.samples]
indices = list(range(len(labels)))
train_idx, val_idx = train_test_split(
indices, test_size=validation_split, stratify=labels, random_state=42
)
# Create train and validation samples
train_samples = [full_dataset.samples[i] for i in train_idx]
val_samples = [full_dataset.samples[i] for i in val_idx]
# Custom dataset classes
class CustomDataset(Dataset):
def __init__(self, samples, classes, class_to_idx, transform):
self.samples = samples
self.classes = classes
self.class_to_idx = class_to_idx
self.transform = transform
def __len__(self):
return len(self.samples)
def __getitem__(self, idx):
path, label = self.samples[idx]
image = Image.open(path).convert('RGB')
if self.transform:
image = self.transform(image)
return image, label
train_dataset = CustomDataset(train_samples, full_dataset.classes,
full_dataset.class_to_idx, train_transform)
val_dataset = CustomDataset(val_samples, full_dataset.classes,
full_dataset.class_to_idx, val_transform)
# Create improved weighted sampler with smoothing
def get_balanced_weights_improved(samples, smoothing=0.7):
class_counts = Counter([label for _, label in samples])
total = len(samples)
num_classes = len(full_dataset.classes)
sample_weights = []
for _, label in samples:
count = class_counts[label]
raw_weight = total / (num_classes * count)
smoothed_weight = raw_weight ** smoothing # Apply smoothing
sample_weights.append(smoothed_weight)
return sample_weights
train_sample_weights = get_balanced_weights_improved(train_samples, smoothing_factor)
sampler = WeightedRandomSampler(
weights=train_sample_weights,
num_samples=len(train_sample_weights),
replacement=True
)
# Create dataloaders
train_loader = DataLoader(
train_dataset,
batch_size=batch_size,
sampler=sampler,
num_workers=2,
pin_memory=True
)
val_loader = DataLoader(
val_dataset,
batch_size=batch_size,
shuffle=False,
num_workers=2,
pin_memory=True
)
print(f"Training with {len(train_dataset)} images, Validation with {len(val_dataset)} images")
print(f"Classes: {full_dataset.classes}")
# Create improved ensemble model
model = FiberEnsembleModelImproved(num_classes=len(full_dataset.classes), dropout_rate=dropout_rate)
model = model.to(device)
# Use DataParallel if multiple GPUs available
if torch.cuda.device_count() > 1:
print(f"Using {torch.cuda.device_count()} GPUs!")
model = nn.DataParallel(model)
# Loss function with class weights (same as Trial 1)
class_weights = torch.FloatTensor(full_dataset.class_weights).to(device)
criterion = nn.CrossEntropyLoss(weight=class_weights)
# Optimizer with increased weight decay
optimizer = optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=weight_decay)
# NEW: ReduceLROnPlateau scheduler
scheduler = optim.lr_scheduler.ReduceLROnPlateau(
optimizer,
mode='max', # Monitor validation accuracy
factor=0.5, # Reduce LR by half
patience=5, # Wait 5 epochs before reducing
)
# Training metrics
train_losses = []
train_accuracies = []
val_losses = []
val_accuracies = []
# Early stopping variables
best_val_acc = 0
patience_counter = 0
best_epoch = 0
print("")
print("Starting improved training...")
# Training loop
for epoch in range(num_epochs):
# Training phase
model.train()
running_loss = 0.0
running_corrects = 0
total_samples = 0
pbar = tqdm(train_loader, desc=f'Epoch {epoch+1}/{num_epochs} [TRAIN]')
for batch_idx, (inputs, targets) in enumerate(pbar):
inputs, targets = inputs.to(device), targets.to(device)
optimizer.zero_grad()
# Forward pass
final_output, eff_output, conv_output, fusion_output = model(inputs)
# Calculate losses (same as Trial 1)
loss_final = criterion(final_output, targets)
loss_eff = criterion(eff_output, targets)
loss_conv = criterion(conv_output, targets)
loss_fusion = criterion(fusion_output, targets)
# Combined loss
total_loss = loss_final + 0.3 * (loss_eff + loss_conv + loss_fusion)
# Backward pass
total_loss.backward()
optimizer.step()
# Statistics
running_loss += total_loss.item()
_, predicted = final_output.max(1)
running_corrects += predicted.eq(targets).sum().item()
total_samples += targets.size(0)
# Update progress bar
current_acc = 100. * running_corrects / total_samples
pbar.set_postfix({
'Loss': f'{running_loss/(batch_idx+1):.3f}',
'Acc': f'{current_acc:.2f}%',
'LR': f'{optimizer.param_groups[0]["lr"]:.2e}'
})
# Training epoch statistics
epoch_train_loss = running_loss / len(train_loader)
epoch_train_acc = 100. * running_corrects / total_samples
train_losses.append(epoch_train_loss)
train_accuracies.append(epoch_train_acc)
# Validation phase
model.eval()
val_running_loss = 0.0
val_running_corrects = 0
val_total_samples = 0
with torch.no_grad():
val_pbar = tqdm(val_loader, desc=f'Epoch {epoch+1}/{num_epochs} [VAL]')
for inputs, targets in val_pbar:
inputs, targets = inputs.to(device), targets.to(device)
final_output, eff_output, conv_output, fusion_output = model(inputs)
# Calculate validation loss
loss_final = criterion(final_output, targets)
loss_eff = criterion(eff_output, targets)
loss_conv = criterion(conv_output, targets)
loss_fusion = criterion(fusion_output, targets)
total_loss = loss_final + 0.3 * (loss_eff + loss_conv + loss_fusion)
val_running_loss += total_loss.item()
_, predicted = final_output.max(1)
val_running_corrects += predicted.eq(targets).sum().item()
val_total_samples += targets.size(0)
val_current_acc = 100. * val_running_corrects / val_total_samples
val_pbar.set_postfix({
'Val Loss': f'{val_running_loss/(val_total_samples//batch_size+1):.3f}',
'Val Acc': f'{val_current_acc:.2f}%'
})
# Validation epoch statistics
epoch_val_loss = val_running_loss / len(val_loader)
epoch_val_acc = 100. * val_running_corrects / val_total_samples
val_losses.append(epoch_val_loss)
val_accuracies.append(epoch_val_acc)
print(f'Epoch {epoch+1}:')
print(f' Train - Loss: {epoch_train_loss:.4f}, Acc: {epoch_train_acc:.2f}%')
print(f' Val - Loss: {epoch_val_loss:.4f}, Acc: {epoch_val_acc:.2f}%')
        # Step the scheduler once per epoch on validation accuracy,
        # logging any learning-rate reduction
        old_lr = optimizer.param_groups[0]["lr"]
        scheduler.step(epoch_val_acc)
        new_lr = optimizer.param_groups[0]["lr"]
        if new_lr != old_lr:
            print(f"Learning rate reduced from {old_lr:.2e} to {new_lr:.2e}")
# Early stopping logic
if epoch_val_acc > best_val_acc:
best_val_acc = epoch_val_acc
patience_counter = 0
best_epoch = epoch
# Save best model
checkpoint = {
'epoch': epoch,
'model_state_dict': model.state_dict(),
'optimizer_state_dict': optimizer.state_dict(),
'train_loss': epoch_train_loss,
'val_loss': epoch_val_loss,
'train_accuracy': epoch_train_acc,
'val_accuracy': epoch_val_acc,
'classes': full_dataset.classes,
'best_val_acc': best_val_acc
}
best_model_path = f'/content/drive/MyDrive/ML/Fibre/trial3_best_ensemble_model.pth'
torch.save(checkpoint, best_model_path)
print(f'New best model saved! Val Acc: {best_val_acc:.2f}%')
else:
patience_counter += 1
print(f'No improvement for {patience_counter}/{early_stopping_patience} epochs')
# Check early stopping
if patience_counter >= early_stopping_patience:
print(f'Early stopping triggered at epoch {epoch+1}')
print(f'Best validation accuracy: {best_val_acc:.2f}% at epoch {best_epoch+1}')
break
        # Save a regular checkpoint every 10 epochs (this re-saves the most
        # recent best-model dict built above)
        if (epoch + 1) % 10 == 0:
checkpoint_path = f'/content/drive/MyDrive/ML/Fibre/trial3_checkpoint_epoch_{epoch+1}.pth'
torch.save(checkpoint, checkpoint_path)
print("")
print("="*80)
print("TRIAL 3 TRAINING COMPLETED!")
print("="*80)
print(f"Best Validation Accuracy: {best_val_acc:.2f}% (Epoch {best_epoch+1})")
print(f"Final Training Accuracy: {train_accuracies[-1]:.2f}%")
print(f"Train-Val Gap: {abs(train_accuracies[-1] - best_val_acc):.2f}%")
if patience_counter >= early_stopping_patience:
print(f"Training stopped early due to no improvement")
else:
print(f"Training completed all {num_epochs} epochs")
print("="*80)
# Plot training curves
plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.plot(train_losses, 'b-', linewidth=2, label='Train')
plt.plot(val_losses, 'r-', linewidth=2, label='Validation')
plt.title('Training & Validation Loss', fontsize=14)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True)
plt.subplot(1, 3, 2)
plt.plot(train_accuracies, 'b-', linewidth=2, label='Train')
plt.plot(val_accuracies, 'r-', linewidth=2, label='Validation')
plt.title('Training & Validation Accuracy', fontsize=14)
plt.xlabel('Epoch')
plt.ylabel('Accuracy (%)')
plt.legend()
plt.grid(True)
plt.subplot(1, 3, 3)
class_counts = Counter([label for _, label in full_dataset.samples])
plt.bar(range(len(full_dataset.classes)),
[class_counts[i] for i in range(len(full_dataset.classes))])
plt.title('Class Distribution', fontsize=14)
plt.xlabel('Class Index')
plt.ylabel('Number of Samples')
plt.xticks(range(len(full_dataset.classes)), full_dataset.classes, rotation=45)
plt.grid(True)
plt.tight_layout()
plt.show()
return model, full_dataset.classes, best_val_acc, train_accuracies[-1]
Results
Trial 3 shows a better overall performance.
Metric | Trial 1 | Trial 3 | Improvement |
---|---|---|---|
Accuracy | 0.7120 | 0.7341 | +0.0221 (+2.21%) |
Balanced Accuracy | 0.8678 | 0.8298 | -0.0380 (-3.80%) |
F1 Macro | 0.7848 | 0.8149 | +0.0302 (+3.02%) |
F1 Weighted | 0.7001 | 0.7312 | +0.0311 (+3.11%) |
Precision Macro | 0.7493 | 0.8095 | +0.0602 (+6.02%) |
Recall Macro | 0.8678 | 0.8293 | -0.0380 (-3.80%) |
Table 4. Metric comparison between Trial 1 and Trial 3
Trial 3 shows stronger overall performance than Trial 1. The model makes more correct predictions overall, achieves a better balance between precision and recall, and produces fewer false positives, which makes its outputs more reliable. Precision saw the biggest gain (+6%), meaning the model has become more conservative: it predicts a class only when it is more confident, which reduces false positives and raises precision, though at the cost of missing some true positives and thus slightly lowering recall. Although it captures slightly fewer true positives in some categories, the overall improvement in accuracy and consistency indicates that Trial 3 is the more effective model.
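For reference, the metrics reported in the comparison tables can be computed with scikit-learn as follows (a minimal sketch; y_true and y_pred stand for the held-out test labels and the model's top-1 predictions):

from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             f1_score, precision_score, recall_score)

def evaluation_summary(y_true, y_pred):
    """Compute the metrics used to compare Trials 1-3."""
    return {
        'Accuracy': accuracy_score(y_true, y_pred),
        'Balanced Accuracy': balanced_accuracy_score(y_true, y_pred),
        'F1 Macro': f1_score(y_true, y_pred, average='macro'),
        'F1 Weighted': f1_score(y_true, y_pred, average='weighted'),
        'Precision Macro': precision_score(y_true, y_pred, average='macro', zero_division=0),
        'Recall Macro': recall_score(y_true, y_pred, average='macro', zero_division=0),
    }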
Interface of FiberSense
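Under the hood, the interface's top-3 output can be sketched as follows (a minimal sketch: val_transform, classes, and device are assumed to come from the training code above, and model is the trained ensemble):

import torch
from PIL import Image

def predict_top3(model, image_path, classes, transform, device):
    """Return the three most likely fibers with their probabilities."""
    model.eval()
    image = Image.open(image_path).convert('RGB')
    x = transform(image).unsqueeze(0).to(device)  # add a batch dimension
    with torch.no_grad():
        final_output, _, _, _ = model(x)  # the ensemble returns four outputs
        probs = torch.softmax(final_output, dim=1)[0]
    top_probs, top_idx = probs.topk(3)
    return [(classes[int(i)], f'{p.item() * 100:.1f}%')
            for p, i in zip(top_probs, top_idx)]

# Hypothetical usage:
# predict_top3(model, 'shirt.jpg', classes, val_transform, device)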

References
- Su, Z., Ribul, M., Cho, Y., & Obrist, M. (2023). TextileNet: A Material Taxonomy-based Fashion Textile Dataset. arXiv. https://doi.org/10.48550/arXiv.2301.06160
- Textile Exchange. (2024). Materials Market Report 2024. https://2d73cea0.delivery.rocketcdn.me/app/uploads/2024/09/Materials-Market-Report-2024.pdf
- UNIQLO. (2022). UNIQLO shop home page. https://www.uniqlo.com/tw/zh_TW/