Agc improvements and improve gain control stability #882

Draft
UnknownSuperficialNight wants to merge 7 commits into RustAudio:master from UnknownSuperficialNight:agc-improvements-and-bug-fix

@UnknownSuperficialNight
Contributor

@UnknownSuperficialNight UnknownSuperficialNight commented May 11, 2026

This PR focusses mostly on adding stability to AGC through the slowdown_factor and miscellaneous improvements.

I've been experimenting with the AGC to find ways to stabilise it. This is the result.

The compute_slowdown_factor functions as a third control layer that measures proximity to the target gain alongside standard RMS and peak metrics. It acts as a dynamic throttle, adjusting the AGC rate of change based on how close the signal is to the desired level. The slowdown logic activates only when the current gain falls within the combined RMS+peak tolerance window relative to the target. When the input is loud, the tolerance window widens; with quieter signals, it contracts.

Inside this boundary, exponential scaling prevents the harsh jumps and oscillations that occurred with fixed-rate adjustments. As the signal approaches the target, the slowdown increases to reduce the AGC rate of change and produce smoother behaviour. Outside this zone, the AGC uses normal responsiveness, which allows for more rapid correction when needed. The tolerance window is bounded by the combined RMS+peak metric.

By managing these ranges, the system enables faster attack times without flattening audio dynamics. Previously, aggressive speeds would normalise all sounds to a flat line. Now the AGC can accelerate adjustments when far from the target but slows down exponentially as it approaches the goal. This preserves audio depth while maintaining stability: quick reactions when needed, with gradual stabilisation near the final level, preventing gain overshoot and sudden volume spikes that can occur with fixed-rate adjustments.
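To make the description above concrete, here is a minimal sketch of how such a throttle could be shaped. It is not the PR's `compute_slowdown_factor`: the function name, the `STEEPNESS` constant, and the choice of `rms + peak` as the tolerance window are all assumptions made for illustration.

```rust
/// Hypothetical sketch, not the PR's implementation.
/// Returns a multiplier for the gain's rate of change: 1.0 means normal speed,
/// smaller values mean the AGC moves more slowly.
fn slowdown_factor(current_gain: f32, target_gain: f32, rms: f32, peak: f32) -> f32 {
    // Assumed tuning constant: how strongly the rate is throttled on target.
    const STEEPNESS: f32 = 4.0;

    // Assumed form of the "combined RMS+peak" window: wider for loud input.
    let tolerance = rms + peak;
    let distance = (target_gain - current_gain).abs();

    // Outside the window (or with a degenerate/NaN window) keep normal responsiveness.
    if !(tolerance > 0.0) || !(distance < tolerance) {
        return 1.0;
    }

    // 0.0 at the edge of the window, 1.0 exactly on target.
    let proximity = 1.0 - distance / tolerance;

    // Exponential scaling: 1.0 at the edge, exp(-STEEPNESS) on target.
    (-STEEPNESS * proximity).exp()
}
```

With these assumed numbers, a gain far from the target keeps a factor of 1.0, while a gain sitting on the target is throttled to roughly 2% of the normal rate.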

update_peak_level Optimisation

This function was a performance hotspot due to per-sample allocation and branching. Previously, we computed a conditional coefficient for each sample: a fast attack coefficient (0.0) when the sample exceeded the peak, and a slow release coefficient otherwise.

I've replaced this with a branchless implementation that uses a fixed release_coefficient (which is always cached), eliminating the per-sample if branch and allocation.

Before (Slow, Branching + Allocation):

// This was allocating each sample
let coeff = if sample_value > self.peak_level {
    // Fast attack for rising peaks
    0.0
} else {
    // Slow release for falling peaks
    release_coeff
};
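The "after" version is not shown here, so the following is only a sketch of one way to drop the conditional coefficient. It assumes the peak is updated as `coeff * peak + (1 - coeff) * sample`; the `PeakTracker` struct and both method names are hypothetical.

```rust
/// Hypothetical peak follower; field and method names are illustrative only.
struct PeakTracker {
    peak_level: f32,
    /// Assumed to be computed once and cached, in the range (0.0, 1.0).
    release_coeff: f32,
}

impl PeakTracker {
    /// Branching form, following the "before" snippet: coefficient 0.0 on attack.
    fn update_branching(&mut self, sample_value: f32) {
        let coeff = if sample_value > self.peak_level {
            0.0
        } else {
            self.release_coeff
        };
        self.peak_level = coeff * self.peak_level + (1.0 - coeff) * sample_value;
    }

    /// Form without the conditional coefficient. The released value is a blend of
    /// the old peak and the sample, so it only exceeds the sample when the sample
    /// is below the old peak; taking the maximum therefore picks the same case.
    fn update_max(&mut self, sample_value: f32) {
        let released =
            self.release_coeff * self.peak_level + (1.0 - self.release_coeff) * sample_value;
        self.peak_level = sample_value.max(released);
    }
}
```

The two forms agree up to floating-point rounding. Whether `f32::max` actually compiles to branch-free code depends on the target, so this would need to be confirmed with a benchmark.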

Other changes in this PR

  • CircularBufferRMS now uses sum-of-squares internally and is cleaned up.
  • Attack and release times are now raw floats instead of coefficients.
  • Added div_or_fallback helper to safely divide by non-NaN, non-infinite, positive values.
  • NaN guards added to RMS and peak logic to prevent either from getting corrupted.
  • Added fast_exp helper using Horner's method for exp(x) approximation in compute_slowdown_factor.
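For readers unfamiliar with the two helpers in the list above, they could look roughly like this. The signatures are assumptions, and the polynomial degree is taken from the commit message (third-order Taylor); the real code may differ.

```rust
/// Divide only when the denominator is finite and strictly positive;
/// otherwise return `fallback`. The signature is an assumption.
fn div_or_fallback(numerator: f32, denominator: f32, fallback: f32) -> f32 {
    // NaN fails both checks, so it also ends up on the fallback path.
    if denominator.is_finite() && denominator > 0.0 {
        numerator / denominator
    } else {
        fallback
    }
}

/// Third-order Taylor polynomial of exp(x) around 0, evaluated with Horner's method:
/// 1 + x + x^2/2 + x^3/6 = 1 + x * (1 + x * (1/2 + x/6)).
fn fast_exp(x: f32) -> f32 {
    1.0 + x * (1.0 + x * (0.5 + x * (1.0 / 6.0)))
}
```

A polynomial like this is only accurate for small `|x|`: the leading error term is `x^4 / 24`, which is about 4e-6 at `x = 0.1` but grows quickly beyond that.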

Benchmarks

Benchmarks before:

Timer precision: 20 ns
effects         fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ agc_enabled  12.38 ms      │ 13.49 ms      │ 12.48 ms      │ 12.54 ms      │ 100     │ 100

Benchmarks after the changes and redesign:

Timer precision: 20 ns
effects         fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ agc_enabled  9.145 ms      │ 12.98 ms      │ 9.209 ms      │ 9.408 ms      │ 100     │ 100

Concerns

The Libopus decoder can output samples above 1.0, such as 1.1, 1.064, and similar values, for both RMS and peak readings depending on the track. This behaviour is not observed with the FLAC decoder.

These out-of-range samples cause errors downstream, particularly when offsetting the current gain below 1.0 while targeting 1.0. I've added .min(1.0) to ensure the gain never exceeds the cap/limit for RMS and peak.

The root cause is with the Libopus decoder, as far as I can tell, which should not output values above 1.0 in the first place.

This is probably worth investigating: is this behaviour by design in Libopus, or is there something wrong upstream of the effect?

Potential Improvements

  • Lookahead Buffer: Rodio does not natively support this, but adding a buffer would allow gradual pre-amplitude gain adjustment before a spike/kick occurs.
  • Dynamic Buffer Size: Adjust the size based on the sample rate to maintain a consistent ~20 ms window (e.g., 2048 samples at 96 kHz), so the window duration stays roughly the same across sample rates.
    Pseudocode Example:
fn buffer_size(sample_rate: u32) -> usize {
    match sample_rate {
        96_000 => 2048,
        192_000 => 4096,
        _ => 1024,
    }
}
  • Speech Profile: Adding a profile tune for dedicated speech to AutomaticGainControlSettings might be a good idea.
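The fixed table in the Dynamic Buffer Size item could also be computed. This sketch assumes the sizes are meant to be the smallest power of two covering the window, which is inferred from the 1024/2048/4096 values and not stated anywhere; the `window_ms` parameter is hypothetical.

```rust
/// Smallest power-of-two buffer that covers at least `window_ms` of audio.
/// Rounding up to a power of two is an assumption inferred from the table above.
fn buffer_size(sample_rate: u32, window_ms: u32) -> usize {
    // Samples needed for the window, rounded up; u64 avoids overflow.
    let samples = (sample_rate as u64 * window_ms as u64 + 999) / 1000;
    (samples.max(1) as usize).next_power_of_two()
}
```

With `window_ms = 20` this gives 1024 at 44.1 kHz and 48 kHz, 2048 at 96 kHz, and 4096 at 192 kHz, matching the table. Unlisted rates differ from the `_ => 1024` arm, for example 8 kHz would get 256.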

Video Comparison

Before:

before_normal.mp4

After:

after_normal.mp4

Before near the loudness limit

near_limit_before.mp4

After near the loudness limit

near_limit_after.mp4

Additional notes

This can be tuned back to how it worked originally if users prefer the more normalised sound.
It might even be worth adding a toggle for the slowdown so that it can be disabled.

…bility

- Replace coefficient-based `attack/release` with direct `Duration` types
- Reduce `RMS_WINDOW_SIZE` from `8192` to `512` samples to lower latency
- Switch RMS calculation from mean-based buffer (`CircularBuffer`) to sum-of-squares approach in `CircularBufferRMS` for accurate root-mean-square values
- Introduce `SlowDownState` struct that manages timing and caching: counts samples in 2ms blocks, computes adaptive `slowdown_factor` using `compute_slowdown_factor` and caches the result for reuse
- Implement `fast_exp` using Horner's method for efficient exponential approximation of release coefficients (third-order Taylor polynomial)
- Add `NaN` handling in RMS calculation to prevent invalid values
- Add rate limiting to gain changes: clamp gain change per sample based on dynamic attack/release duration to prevent overshooting
- Add new `peak_tracking_window` setting to control peak level smoothing
- Tune default timing parameters: 500ms attack, 0.5ms release, 10ms peak tracking window for balanced behaviour
…calculation

- Replace hardcoded `1.0` fallback with `self.current_gain` when `RMS` equals `0.0`
- Add comment explaining this keeps gain stable or allows gradual decay instead of sudden drops
- Cap peak tracking at 1.0 to handle out-of-bounds decoder samples
- Ensure samples from decoders that are not normalised like `libopus` do not track out-of-bounds values
- Cap rms tracking at 1.0 to handle out-of-bounds decoder samples
- Ensure samples from decoders that are not normalised like `libopus` do not track out-of-bounds values
- Change `RMS_WINDOW_SIZE` constant from `512` to `1024`
- 1024 samples provides ~23ms window at 44.1kHz / ~21ms at 48kHz for stable RMS estimation
Comment thread src/source/agc.rs
release_time: Duration::from_secs(0), // Recommended release time
absolute_max_gain: 7.0, // Recommended max gain
target_level: 1.0, // Default to original level
attack_time: Duration::from_millis(500), // Recommended attack time
Contributor Author

This might be too low. I found 500 ms or 800 ms to be quite nice. I would like some feedback on this, if possible.

Member

Sorry, I have no idea what works best for the new algorithm. For speech, quite fast was useful.

Member

@yara-blue yara-blue left a comment

I also like the idea for multiple profiles. Ideally we also give the "current" default a name, maybe "Music" and "Speech"?

@roderickvd
Member

> The Libopus decoder can output samples above 1.0, such as 1.1, 1.064, and similar values, for both RMS and peak readings depending on the track. This behaviour is not observed with the FLAC decoder.
>
> These out-of-range samples cause errors downstream, particularly when offsetting the current gain below 1.0 while targeting 1.0. I've added .min(1.0) to ensure the gain never exceeds the cap/limit for RMS and peak.

Values outside of -1.0..=1.0 aren't out-of-range for a DSP pipeline. Strange as it may be from libopus itself, the beauty of working in normalized floating point is that it never clips until it's finally converted to integer. A chain of Rodio filters itself could also return values > 1.0 even if the decoder wouldn't.

Long story short, we should deal with such values without clipping them.

@UnknownSuperficialNight
Contributor Author

UnknownSuperficialNight commented May 12, 2026

> > The Libopus decoder can output samples above 1.0, such as 1.1, 1.064, and similar values, for both RMS and peak readings depending on the track. This behaviour is not observed with the FLAC decoder.
> >
> > These out-of-range samples cause errors downstream, particularly when offsetting the current gain below 1.0 while targeting 1.0. I've added .min(1.0) to ensure the gain never exceeds the cap/limit for RMS and peak.
>
> Values outside of -1.0..=1.0 aren't out-of-range for a DSP pipeline. Strange as it may be from libopus itself, the beauty of working in normalized floating point is that it never clips until it's finally converted to integer. A chain of Rodio filters itself could also return values > 1.0 even if the decoder wouldn't.
>
> Long story short, we should deal with such values without clipping them.

Any ideas on this?

The first thing that comes to mind, though I could be wrong, is something like self.peak_level.max(1.0), stored per sample, basically like this:

let full_scale = self.peak_level.max(1.0);

// Calculate max gain change per sample based on dynamic attack/release times
let max_attack_gain_change_per_sample = full_scale / (dynamic_attack_time * sample_rate);
let max_release_gain_change_per_sample = full_scale / (release_duration * sample_rate);

Basically, go through and compute a new maximum per sample and scale for that.

Just throwing ideas out there.

We would probably have to go through it all again and remove the 1.0 assumption.
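As a self-contained illustration of the snippet above, the per-sample limit could be applied like this. The function names are hypothetical; the mapping of attack time to upward movement follows what is said later in this thread, and the step bounds are assumed to be non-negative and not NaN.

```rust
/// Largest gain change allowed per sample so that moving by `full_scale`
/// takes `time_secs`. Mirrors the division in the snippet above.
fn max_gain_change_per_sample(full_scale: f32, time_secs: f32, sample_rate: f32) -> f32 {
    full_scale / (time_secs * sample_rate)
}

/// Move `current_gain` toward `desired_gain`, limited to `max_rise` upwards
/// and `max_fall` downwards for this sample.
/// Assumes both bounds are non-negative and not NaN (`clamp` panics otherwise).
fn step_gain(current_gain: f32, desired_gain: f32, max_rise: f32, max_fall: f32) -> f32 {
    let delta = (desired_gain - current_gain).clamp(-max_fall, max_rise);
    current_gain + delta
}
```

For example, with a tracked peak of 1.274, `full_scale` becomes 1.274, and a 0.5 s attack at 48 kHz allows a rise of about 0.000053 per sample.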

@yara-blue
Member

> Any ideas on this?
>
> Long story short, we should deal with such values without clipping them.

AGC maps input to the range [-1.0, 1.0]. To do so without clipping, it needs the width of the input range. It can't look ahead to see what other samples will be emitted and thus what the peak is. All I can think of is to assume the input range to be unrealistically big, say [-1.5, 1.5]. Is that unrealistically big?

> Basically go through and compute a new max per sample and scale for that.

Let's look at an extreme: what would happen if, halfway through playback, one single sample peaks really high, let's say 10.0? Would everything get quieter after that sample?

@roderickvd
Member

Please excuse me responding a bit theoretically without recent study of the current implementation:

The gain calculation should fundamentally be the ratio target / measured, regardless of whether the peak is 0.8, 1.0, or 1.1. No fixed ceiling should be needed. Instead of clipping with .min(1.0) we should track the true peak. If the peak is 1.1, the AGC should respond with a gain below 1.0.

If the root cause is that the code assumes 1.0 as ceiling, then ideally we should remove those assumptions.

@UnknownSuperficialNight
Contributor Author

> > Any ideas on this?
> >
> > Long story short, we should deal with such values without clipping them.
>
> AGC maps input to the range [-1.0, 1.0]. To do so without clipping it needs the width of the input range. It can't look ahead to see what other samples will be emitted and thus what the peak is. All I can think of is to assume the input range to be unrealistically big, say: [-1.5, 1.5]? Is that unrealistically big?

I was thinking about a running maximum: we track each sample and, if a sample exceeds the running maximum, the new sample replaces it. That was my original idea anyway.

> > Basically go through and compute a new max per sample and scale for that.
>
> Lets look at some extreme, what would happen if halfway through playback one single sample peaks really high, lets say 10.0? Would everything get quieter after that sample?

Possibly. I guess that would depend on the implementation; maybe there is a peak decay after n samples, etc.

Though neither of these seems like a proper solution, I must admit.

@UnknownSuperficialNight
Contributor Author

UnknownSuperficialNight commented May 13, 2026

> Please excuse me responding a bit theoretically without recent study of the current implementation:
>
> The gain calculation should fundamentally be the ratio target / measured, regardless of whether the peak is 0.8, 1.0, or 1.1. No fixed ceiling should be needed. Instead of clipping with .min(1.0) we should track the true peak. If the peak is 1.1, the AGC should respond with a gain below 1.0.

This is what happens without the min(1.0): when the RMS or peak goes over 1.0, the AGC gain dips below 1.0 and/or the volume drops in spikes.

This could be an issue, though, as we could get, for example, dips to 0.7 gain and dropping volume.

My approach was to limit the gain to 1.0/target by default. In other words, if the signal is at max or very close to max, the gain should be around source so that the sound is the same as the original. Then, if the input clips, the AGC clips (stays the same as source gain), and if the input does not clip, the AGC does not clip. In other words, it defaults to what the original sound was.

One thing we could do here is remove the min(1.0) and let the gain fall below 1.0 if the calculated RMS or peak goes above 1.0, but by default use the limiter to limit gain to 1.0. That way, people who want gain dropping below 1.0 can set the limiter to 0.0, while by default it stops at source, i.e. 1.0.

> If the root cause is that the code assumes 1.0 as ceiling, then ideally we should remove those assumptions.

It would be ideal, though. However, how would we scale the peak and RMS then? There needs to be a scale somewhere. I guess a true peak would be it, but that would only really be possible with a running maximum, as we cannot pre-process the file to find a true peak, nor can we look ahead with a look-ahead buffer.

@roderickvd
Member

The ratio arithmetic handles 1.1 as naturally as 0.8; that's not the issue. The issue is pumping: the AGC attenuates correctly on a spike, then takes release_time to recover. This is the case regardless of whether values exceed 1.0. Keeping a running maximum with decay seems the right direction: fast attack on the tracked peak with slow release.

@UnknownSuperficialNight
Contributor Author

UnknownSuperficialNight commented May 13, 2026

> The ratio arithmetic handles 1.1 as naturally as 0.8; that's not the issue. The issue is pumping: the AGC attenuates correctly on a spike, then takes release_time to recover. This is the case regardless of whether values exceed 1.0. Keeping a running maximum with decay seems the right direction: fast attack on the tracked peak with slow release.

So, in other words, you think it should allow downward spikes below the current gain, but then bounce right back into the 1.0 range once we are no longer peaking? That is, turn this:

RMS: 0.2175, Peak: 0.2972, Desired Gain: 3.3644, Current Gain: 1.1774, Release Coefficient: 0.99791884, Attack Time: 0.5000
RMS: 0.2067, Peak: 0.4405, Desired Gain: 2.2703, Current Gain: 1.1814, Release Coefficient: 0.99791884, Attack Time: 0.5000
RMS: 0.2322, Peak: 0.7768, Desired Gain: 1.2874, Current Gain: 1.1854, Release Coefficient: 0.99791884, Attack Time: 103.1927
RMS: 0.2611, Peak: 1.2740, Desired Gain: 0.7849, Current Gain: 0.7575, Release Coefficient: 0.99791884, Attack Time: 236.4415
RMS: 0.2698, Peak: 1.0879, Desired Gain: 0.9192, Current Gain: 0.7575, Release Coefficient: 0.99791884, Attack Time: 138.3390
RMS: 0.2740, Peak: 0.9332, Desired Gain: 1.0716, Current Gain: 0.7575, Release Coefficient: 0.99791884, Attack Time: 66.1844
RMS: 0.3013, Peak: 0.9244, Desired Gain: 1.0818, Current Gain: 0.7575, Release Coefficient: 0.99791884, Attack Time: 66.0648
RMS: 0.3073, Peak: 0.7932, Desired Gain: 1.2608, Current Gain: 0.7576, Release Coefficient: 0.99791884, Attack Time: 21.9546
RMS: 0.3330, Peak: 0.8451, Desired Gain: 1.1833, Current Gain: 0.7577, Release Coefficient: 0.99791884, Attack Time: 39.6588

Into something more like this:

RMS: 0.2175, Peak: 0.2972, Desired Gain: 3.3644, Current Gain: 1.1774, Release Coefficient: 0.99791884, Attack Time: 0.5000
RMS: 0.2067, Peak: 0.4405, Desired Gain: 2.2703, Current Gain: 1.1814, Release Coefficient: 0.99791884, Attack Time: 0.5000
RMS: 0.2322, Peak: 0.7768, Desired Gain: 1.2874, Current Gain: 1.1854, Release Coefficient: 0.99791884, Attack Time: 103.1927
RMS: 0.2611, Peak: 1.2740, Desired Gain: 0.7849, Current Gain: 0.7575, Release Coefficient: 0.99791884, Attack Time: 236.4415
RMS: 0.2698, Peak: 1.0879, Desired Gain: 0.9192, Current Gain: 0.7575, Release Coefficient: 0.99791884, Attack Time: 138.3390

// Right back up to normal ranges after the spike, as an example?
RMS: 0.2740, Peak: 0.9332, Desired Gain: 1.0716, Current Gain: 1.1854, Release Coefficient: 0.99791884, Attack Time: 66.1844
RMS: 0.3013, Peak: 0.9244, Desired Gain: 1.0818, Current Gain: 1.1854, Release Coefficient: 0.99791884, Attack Time: 66.0648
RMS: 0.3073, Peak: 0.7932, Desired Gain: 1.2608, Current Gain: 1.1854, Release Coefficient: 0.99791884, Attack Time: 21.9546
RMS: 0.3330, Peak: 0.8451, Desired Gain: 1.1833, Current Gain: 1.1854, Release Coefficient: 0.99791884, Attack Time: 39.6588

@roderickvd
Member

Not quite. The goal isn't to snap current gain back; it's to smooth the desired gain so it doesn't swing as wildly. Something like this (illustrative numbers):

RMS: 0.2611, Peak: 1.2740, Tracked Peak: 1.2740, Desired Gain: 0.7849, Current Gain: 0.7575
RMS: 0.2698, Peak: 1.0879, Tracked Peak: 1.2358, Desired Gain: 0.8092, Current Gain: 0.7650
RMS: 0.2740, Peak: 0.9332, Tracked Peak: 1.1987, Desired Gain: 0.8342, Current Gain: 0.7740
RMS: 0.3013, Peak: 0.9244, Tracked Peak: 1.1627, Desired Gain: 0.8601, Current Gain: 0.7840

Current gain climbs gradually as the tracked peak decays without abrupt jumps. You might be able to just remove the .min(1.0) so it can track peaks above 1.0?
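The illustrative numbers above are consistent with a tracked peak that jumps up instantly and decays by a factor of 0.97 per logged step, with desired gain equal to `target / tracked peak`. The 0.97 factor is inferred from the numbers, not stated, and the function names below are hypothetical.

```rust
/// Hypothetical tracker: follow rising peaks instantly, otherwise decay
/// the previous tracked value by a fixed factor per step.
fn track_peak(tracked: f64, peak: f64, decay: f64) -> f64 {
    peak.max(tracked * decay)
}

/// Desired gain as the plain ratio target / measured.
/// Assumes `tracked_peak` is positive.
fn desired_gain(target: f64, tracked_peak: f64) -> f64 {
    target / tracked_peak
}

/// Returns (tracked peak, desired gain) for each input peak.
fn trace(peaks: &[f64], decay: f64, target: f64) -> Vec<(f64, f64)> {
    let mut tracked = 0.0;
    peaks
        .iter()
        .map(|&peak| {
            tracked = track_peak(tracked, peak, decay);
            (tracked, desired_gain(target, tracked))
        })
        .collect()
}
```

Calling `trace(&[1.2740, 1.0879, 0.9332, 0.9244], 0.97, 1.0)` yields tracked peaks and desired gains that match the four rows above to within rounding. The smoothing of current gain toward desired gain is a separate step and is not modelled here.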

@UnknownSuperficialNight
Contributor Author

UnknownSuperficialNight commented May 15, 2026

> Not quite. The goal isn't to snap current gain back; it's to smooth the desired gain so it doesn't swing as wildly. Something like this (illustrative numbers):
>
> RMS: 0.2611, Peak: 1.2740, Tracked Peak: 1.2740, Desired Gain: 0.7849, Current Gain: 0.7575
> RMS: 0.2698, Peak: 1.0879, Tracked Peak: 1.2358, Desired Gain: 0.8092, Current Gain: 0.7650
> RMS: 0.2740, Peak: 0.9332, Tracked Peak: 1.1987, Desired Gain: 0.8342, Current Gain: 0.7740
> RMS: 0.3013, Peak: 0.9244, Tracked Peak: 1.1627, Desired Gain: 0.8601, Current Gain: 0.7840
>
> Current gain climbs gradually as the tracked peak decays without abrupt jumps. You might be able to just remove the .min(1.0) so it can track peaks above 1.0?

So, basically, like the current functionality without the min(1.0), but using a tracked peak that lets the gain go up as it decays from the peak.

That is pretty much what happens currently with the min(1.0) removed: when a peak goes above 1.0, it offsets the current gain and desired gain; then, once the peaks drop below 1.0, the current gain slowly rises towards the desired gain, which is high by that point as the signal is no longer peaking.

[Image: 2026-05-14_21-49]

As you can see in the highlighted areas in the image, this happens as you described, with just the normal peak already acting as the tracked peak without min(1.0). Basically, the same thing happens.

NOTE: each line is only printed every 2ms so the jumps are way smoother than shown in the image.

This could be tuned since the attack time is affecting the Current Gain speed.

But a true peak with decay would be better/more accurate than the EMA for this.

Video example:

adadad.mp4

The current AGC can only rise so fast, as it uses the attack time while moving upwards. Thus, it's already smoothed. We could just add a conditional that applies a certain attack time to control the rate of growth while below 1.0. That way we don't mess with the desired gain too much and we reuse the attack time, so this should be relatively efficient. It's one possible way.

An example here is that, with attack time disabled, it rises normally, similar to what you described. We could just override the attack time to a low value to precisely control the recovery rate during a peak. That's another solution.

[Image: sdad]

However, there is a flaw in my opinion. Here is an example:

With limit: (Sounds better)

1.mp4

Without limit: (Sounds Worse)

2.mp4

In the second video, without the limit, you can see it peaking often enough to keep the gain constantly below the source levels intended by the artist.

That is why I preferred to limit it to 1.0, as 1.0 is the artist's original intention (non-boosted).

Currently, when it goes above peak, it just defaults to basically the source input samples, as it should be around there already if it's that loud.

Any song with distortion or intentional peaking/clipping is going to be reduced below source, and it will not sound like the original artist's intention.

That's my biggest issue with allowing gain below 1.0 without setting the target below 1.0.

@roderickvd
Member

Great to see that analysis.

Yeah, I agree that the ramp-up should already work as needed, with fast attack and slow release. Running without that .min(1.0) is exactly the tracked-peak behavior I was proposing.

The "artist's intention" argument assumes AGC is the only filter in the chain, but that's not a valid assumption. An amplify(1.5) before it pushes samples above 1.0 before they reach AGC. That's not clipping, just floating-point working as intended. AGC has to handle whatever its input delivers.

If a user wants a ceiling at source level, wouldn't that already be the existing floor field?

- Extract `fast_exp` function from `agc.rs` to `math.rs`
- Export `fast_exp` as `pub(crate)` for reuse across the codebase
- Update imports in `agc.rs` to use the shared `fast_exp`
@UnknownSuperficialNight
Contributor Author

UnknownSuperficialNight commented May 15, 2026

> Yeah, I agree that the ramp-up should already work as needed, with fast attack and slow release. Without that .min(1.0) is exactly the tracked-peak behavior I was proposing.

Do you think we should bother with a true peak or just use peak as is, since it's good enough to double as true peak?

> The "artist's intention" argument assumes AGC is the only filter in the chain, but that's not a valid assumption. An amplify(1.5) before it pushes samples above 1.0 before they reach AGC. That's not clipping, just floating-point working as intended. AGC has to handle whatever its input delivers.

True.

> If a user wants a ceiling at source level, wouldn't that already be the existing floor field?

True, it should, and it would make both use cases possible. However, as it currently stands, it's not working as intended. Peak and RMS are taking precedence over the floor. It's something I still need to fix.

Turns out I have already fixed it. What do you think about the default floor value: 1.0 or 0.0?

Personally, I prefer 1.0, which people can then overwrite to their desired level, but I might be a bit biased.

What do you think?

It might even be worth adding a toggle for the slowdown_factor so that we can disable it.

What do you think of that being an option, just to make it more flexible and modularised?

The slowdown_factor is much more helpful when the gain is limited to 1.0, whereas below 1.0 it might hurt or help depending on what the user wants.

@roderickvd
Member

> Do you think we should bother with a true peak or just use peak as is?

Using peak as-is should be fine.

True peak could be an option if users would know it, for example from ReplayGain metadata. I'm saying could be because this could also be moving into "You Ain't Gonna Need It" territory. Arguably, for music with ReplayGain, a limiter may be preferred over AGC.

> Turns out I have already fixed it though what do you think about the default floor value at 1.0 or 0.0
>
> Personally, I prefer 1.0, which people can then overwrite to their desired level, but I might be a bit biased.

What's your rationale for 1.0?

Not knowing that, I'd lean 0.0. With target_level = 1.0 as default and floor = 1.0, the AGC effectively becomes "amplify only". That's a useful mode but it's a quiet-passage booster, not really AGC.
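A tiny sketch of why `floor = 1.0` with `target_level = 1.0` behaves as "amplify only". It assumes the floor acts as a lower bound on the applied gain, which is how it is described in this thread; `gain_with_floor` is a hypothetical function, not Rodio's API.

```rust
/// Hypothetical: desired gain is target / peak, and `floor` is a lower bound on it.
/// Assumes `peak` is positive.
fn gain_with_floor(target_level: f32, peak: f32, floor: f32) -> f32 {
    (target_level / peak).max(floor)
}
```

With a peak of 1.274, a floor of 0.0 lets the gain drop to about 0.785, while a floor of 1.0 holds it at 1.0, so loud input is never attenuated. Quiet input (for example a peak of 0.25, giving a gain of 4.0) is boosted the same way under both settings.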

> slowdown_factor as a toggle?

Could be interesting. At the same time I'm thinking if and how we could relate it to release_duration.

@UnknownSuperficialNight
Contributor Author

UnknownSuperficialNight commented May 15, 2026

> > Do you think we should bother with a true peak or just use peak as is?
>
> Using peak as-is should be fine.
>
> True peak could be an option if users would know it, for example from ReplayGain metadata. I'm saying could be because this could also be moving into "You Ain't Gonna Need It" territory. Arguably, for music with ReplayGain, a limiter may be preferred over AGC.
>
> > Turns out I have already fixed it though what do you think about the default floor value at 1.0 or 0.0
> >
> > Personally, I prefer 1.0, which people can then overwrite to their desired level, but I might be a bit biased.
>
> What's your rationale for 1.0?
>
> Not knowing that, I'd lean 0.0. With target_level = 1.0 as default and floor = 1.0, the AGC effectively becomes "amplify only". That's a useful mode but it's a quiet-passage booster, not really AGC.

I use it for all my music. With the slowdown_factor included, it basically defaults to 1.0 if the song is loud enough. Personally, I'm looking to keep the signal as is (source) when loud; if the song is not normalised or is quiet, the AGC boosts it, but loud parts go right back to 1.0. With the slowdown_factor, it basically becomes the same as source for loud sounds: the gain is in the 1.0001 to 1.00001 range, so basically source.

This comes down mostly to what the most popular use case is. Would people rather have source as the default and change it manually otherwise, or are more people going to need it uncapped by default? Whichever is the more popular use case is, I'd say, what the limit should default to.

For me, I see my use case as a pretty popular one, but I may be wrong.

> quiet-passage booster, not really AGC.

At that point it kind of is, but only for music that is mixed/mastered well. For songs that are not, it acts like an AGC, giving headroom to adjust the gain up or down to maintain a consistent level.

> > slowdown_factor as a toggle?
>
> Could be interesting. At the same time I'm thinking if and how we could relate it to release_duration.

I tried that with release_duration before I made this PR, and it did not go well. I experimented with that very thing and also made release_duration dynamic, but I found that for music it's a horrible idea. Most songs do not clip, but no matter how low I tuned it, it clipped on around 10% of songs and added extra latency. Hence, I scrapped it and figured a solid, controllable release_duration is just better. It could possibly be good for voice, but that would be language dependent, as some languages have plosive consonants that would peak if preceded by a quiet part or at the start of a word.

Originally, I made it 0.0 and just had it there for people who need it for their specific use cases. 0.0 worked, but it did rarely cause clipping, as it would react instantly, so I found by trial and error the point where it no longer clips, and that is the default now.

If we had a lookahead buffer, this would be a good addition, as then we could see a peak coming and lower the gain smoothly in time to not peak.
