Robustness Evaluation Suite
This page documents the Robustness Evaluation Suite: utilities for measuring model accuracy under adversarial attack and summarizing the results in a RobustnessReport.
AdversarialAttacks.RobustnessReport — Type
```
RobustnessReport
```

Report on model robustness against an adversarial attack. Printing a `RobustnessReport` (via `println(report)`) displays a formatted summary including clean/adversarial accuracy, attack success rate, and robustness score.
Fields
- `num_samples::Int`: Total samples evaluated
- `num_clean_correct::Int`: Samples correctly classified before the attack
- `clean_accuracy::Float64`: Accuracy on clean samples
- `adv_accuracy::Float64`: Accuracy on adversarial samples
- `attack_success_rate::Float64`: Attack success rate (ASR); the fraction of successful attacks among correctly classified samples
- `robustness_score::Float64`: `1.0 - attack_success_rate`
- `num_successful_attacks::Int`: Number of successful attacks
- `linf_norm_max::Float64`: Maximum L∞ norm of perturbations across all samples
- `linf_norm_mean::Float64`: Mean L∞ norm of perturbations across all samples
- `l2_norm_max::Float64`: Maximum L2 norm of perturbations across all samples
- `l2_norm_mean::Float64`: Mean L2 norm of perturbations across all samples
- `l1_norm_max::Float64`: Maximum L1 norm of perturbations across all samples
- `l1_norm_mean::Float64`: Mean L1 norm of perturbations across all samples
Note
An attack succeeds when the clean prediction is correct but the adversarial prediction is incorrect.
- The L∞ norm measures the maximum absolute change in any feature of the input.
- The L2 norm measures the Euclidean distance between original and adversarial samples.
- The L1 norm measures the Manhattan distance (sum of absolute differences).
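For intuition, the three norms can be computed by hand with `LinearAlgebra.norm`; the snippet below is a standalone sketch that assumes nothing about the package API:

```julia
using LinearAlgebra

x     = [0.10, 0.20, 0.30]  # original input
x_adv = [0.15, 0.18, 0.30]  # adversarial input
δ     = x_adv - x           # perturbation

norm(δ, Inf)  # L∞: largest absolute per-feature change → 0.05
norm(δ, 2)    # L2: Euclidean length of the perturbation → ≈0.0539
norm(δ, 1)    # L1: sum of absolute changes → 0.07
```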
AdversarialAttacks.calculate_metrics — Method
```
calculate_metrics(n_test, num_clean_correct, num_adv_correct,
                  num_successful_attacks, l_norms)
```

Compute accuracy, attack success, robustness, and perturbation norm statistics for an adversarial evaluation.
Arguments
- `n_test`: Number of test samples
- `num_clean_correct`: Number of correctly classified clean samples
- `num_adv_correct`: Number of correctly classified adversarial samples
- `num_successful_attacks`: Number of successful adversarial attacks
- `l_norms`: Dictionary containing perturbation norm arrays with keys `:linf`, `:l2`, and `:l1`
Returns
- A `RobustnessReport` containing accuracy, robustness, and norm summary metrics (maximum and mean) for all three norm types.
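As an illustration, here is a hypothetical call with made-up counts and norm arrays; the expected values in the comments follow the field definitions above (ASR is successes divided by correctly classified clean samples):

```julia
# Illustrative inputs: 4 test samples, 3 correct before the attack,
# 1 still correct after it, and 2 attacks that flipped a correct prediction.
l_norms = Dict(:linf => [0.10, 0.08, 0.00, 0.09],
               :l2   => [0.50, 0.40, 0.00, 0.45],
               :l1   => [1.20, 0.90, 0.00, 1.00])

report = calculate_metrics(4, 3, 1, 2, l_norms)

report.clean_accuracy       # 3/4 = 0.75
report.attack_success_rate  # 2/3 ≈ 0.667
report.robustness_score     # 1 - 2/3 ≈ 0.333
```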
AdversarialAttacks.compute_norm — Method
```
compute_norm(sample_data, adv_data, p::Real)
```

Compute the Lp norm of the perturbation between original data and adversarial data.

This function uses `LinearAlgebra.norm` for efficiency and numerical stability.
Arguments
- `sample_data`: Original sample data.
- `adv_data`: Adversarially perturbed version of `sample_data`.
- `p::Real`: Order of the norm. Must be positive or `Inf`. Common values: `1` (Manhattan/L1), `2` (Euclidean/L2), `Inf` (maximum/L∞).
Returns
- `Float64`: The Lp norm of the perturbation, `||adv_data - sample_data||_p`.
Examples
```julia
original = [1.0, 2.0, 3.0]
adversarial = [1.5, 2.5, 3.5]

compute_norm(original, adversarial, 2)   # L2 (Euclidean) norm
compute_norm(original, adversarial, 1)   # L1 (Manhattan) norm
compute_norm(original, adversarial, Inf) # L∞ (maximum) norm
```

References
- Lp space: https://en.wikipedia.org/wiki/Lp_space
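Since the docstring notes that compute_norm delegates to `LinearAlgebra.norm`, an equivalent helper looks roughly like the one-liner below (a sketch under that assumption, not the package's verbatim source; `perturbation_norm` is a hypothetical name):

```julia
using LinearAlgebra

# Sketch: Lp norm of the perturbation, flattening the arrays so the
# entrywise norm is taken regardless of input shape.
perturbation_norm(sample_data, adv_data, p::Real) =
    Float64(norm(vec(adv_data) .- vec(sample_data), p))

perturbation_norm([1.0, 2.0, 3.0], [1.5, 2.5, 3.5], Inf)  # 0.5
```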
AdversarialAttacks.evaluate_robustness — Method
```
evaluate_robustness(model, atk, test_data; num_samples=100)
```

Evaluate model robustness by running the attack on multiple samples.
For each sample, computes clean and adversarial predictions, tracks attack success, and calculates perturbation norms (L∞, L2, and L1).
Arguments
- `model`: The model to evaluate.
- `atk`: The attack to use.
- `test_data`: Collection of test samples.
- `num_samples::Int=100`: Number of samples to test. If this exceeds the number of available samples, all available samples are used.
Returns
- `RobustnessReport`: Report containing accuracy, attack success rate, robustness metrics, and perturbation statistics for the L∞, L2, and L1 norms.
Example
```julia
report = evaluate_robustness(model, FGSM(epsilon=0.1), test_data, num_samples=50)
println(report)
```
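The per-sample loop described above can be pictured roughly as follows. This is a hypothetical sketch: `predict` and `perturb` are stand-ins for the model's and attack's actual interfaces, not confirmed AdversarialAttacks API, and each test sample is assumed to be an `(input, label)` pair:

```julia
# Hypothetical sketch of the evaluation loop; `predict` and `perturb`
# are assumed helpers, not confirmed package API.
function robustness_loop(model, atk, test_data; num_samples = 100)
    n = min(num_samples, length(test_data))
    clean_correct = adv_correct = successes = 0
    linf = Float64[]
    for (x, y) in Iterators.take(test_data, n)
        ŷ     = predict(model, x)          # clean prediction
        x_adv = perturb(atk, model, x, y)  # adversarial example
        ŷ_adv = predict(model, x_adv)      # adversarial prediction
        clean_correct += (ŷ == y)
        adv_correct   += (ŷ_adv == y)
        # An attack succeeds when the clean prediction is correct
        # but the adversarial prediction is not.
        successes += (ŷ == y) && (ŷ_adv != y)
        push!(linf, maximum(abs.(x_adv .- x)))  # L∞ perturbation norm
    end
    return (clean_correct = clean_correct, adv_correct = adv_correct,
            successes = successes, linf = linf)
end
```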
AdversarialAttacks.evaluation_curve — Method

```
evaluation_curve(model, atk_type, epsilons, test_data; num_samples=100)
```

Evaluate model robustness across a range of attack strengths.

For each value in `epsilons`, an attack of type `atk_type` is instantiated and used to compute clean accuracy, adversarial accuracy, attack success rate, robustness score, and perturbation norms (L∞, L2, and L1).
Arguments
- `model`: Model to be evaluated.
- `atk_type`: Adversarial attack type.
- `epsilons`: Vector of attack strengths.
- `test_data`: Test dataset.
Keyword Arguments
- `num_samples::Int=100`: Number of samples used for each epsilon evaluation.
Returns
- A dictionary containing evaluation metrics for each epsilon value:
  - `:epsilons`: Attack strength values.
  - `:clean_accuracy`: Clean accuracy for each epsilon.
  - `:adv_accuracy`: Adversarial accuracy for each epsilon.
  - `:attack_success_rate`: Attack success rate for each epsilon.
  - `:robustness_score`: Robustness score (1 - ASR) for each epsilon.
  - `:linf_norm_mean`, `:linf_norm_max`: L∞ norm statistics.
  - `:l2_norm_mean`, `:l2_norm_max`: L2 norm statistics.
  - `:l1_norm_mean`, `:l1_norm_max`: L1 norm statistics.
Example
```julia
results = evaluation_curve(model, FGSM, [0.01, 0.05, 0.1], test_data, num_samples=100)
println("Attack success rates: ", results[:attack_success_rate])
```
println("RobustnessReport fields: ", fieldnames(RobustnessReport))RobustnessReport fields: (:num_samples, :num_clean_correct, :clean_accuracy, :adv_accuracy, :attack_success_rate, :robustness_score, :num_successful_attacks, :linf_norm_max, :linf_norm_mean, :l2_norm_max, :l2_norm_mean, :l1_norm_max, :l1_norm_mean)