clip_gradient_by_quantile.Rd
This function provides an adaptive way to control gradient magnitudes. It computes a threshold as a specified quantile of the absolute values of the entries in the gradient matrix itself. Any entry whose absolute value exceeds this threshold is clipped, that is, shrunk back to the threshold while preserving its original sign.
clip_gradient_by_quantile(gradient, quantile = 0.8)
A new gradient matrix with extreme values clipped.
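Unlike a fixed threshold, the quantile-based threshold is recomputed from each gradient it is applied to, so it adapts automatically to the scale of the gradients at each optimization step. This makes it more robust when gradient magnitudes change over the course of training.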
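The behaviour described above can be sketched in a few lines of base R. This is a minimal illustration, not the package's actual source: the name clip_by_quantile_sketch is hypothetical, and the sketch assumes the threshold comes from R's default quantile type.

```r
# Hypothetical re-implementation for illustration only; the real
# clip_gradient_by_quantile() may differ in validation and edge cases.
clip_by_quantile_sketch <- function(gradient, quantile = 0.8) {
  stopifnot(is.numeric(gradient), quantile >= 0, quantile <= 1)

  # Threshold: the requested quantile of the absolute gradient entries
  threshold <- stats::quantile(abs(gradient), probs = quantile, names = FALSE)

  # Clamp every entry to [-threshold, threshold]. pmin()/pmax() copy the
  # dim attribute from their first argument, so a matrix stays a matrix.
  pmax(pmin(gradient, threshold), -threshold)
}

# Small usage example with two large-magnitude entries (-5 and 10)
g <- matrix(c(-5, -1, 0.5, 2, 10, 0.1), nrow = 2)
clipped <- clip_by_quantile_sketch(g, quantile = 0.5)
```

Entries whose magnitude is at or below the threshold pass through unchanged, so only the largest values are affected and every sign is preserved.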
# Create a gradient with some large outlier values
set.seed(123)
grad_matrix <- matrix(rnorm(100, mean = 0, sd = 1), 10, 10)
grad_matrix[1, 1] <- 10 # Large positive outlier
grad_matrix[5, 5] <- -12 # Large negative outlier
# Clip at the 80th percentile of the absolute values. This caps the outliers,
# along with every other entry whose magnitude exceeds the threshold.
clipped_grad <- clip_gradient_by_quantile(grad_matrix, quantile = 0.80)
cat("Original Gradient Range:\n")
#> Original Gradient Range:
print(range(grad_matrix))
#> [1] -12 10
cat("\n80th Percentile Threshold:\n")
#>
#> 80th Percentile Threshold:
# The threshold will be the 80th percentile of the absolute values
print(quantile(abs(grad_matrix), probs = 0.80))
#> 80%
#> 1.221391
cat("\nClipped Gradient Range:\n")
#>
#> Clipped Gradient Range:
# The new range will be capped at the threshold
print(range(clipped_grad))
#> [1] -1.221391 1.221391