feat(algorithms): add bubble, insertion, counting, and quicksort#570
Open
Robertboy18 wants to merge 4 commits into
Open
feat(algorithms): add bubble, insertion, counting, and quicksort#570Robertboy18 wants to merge 4 commits into
Robertboy18 wants to merge 4 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds foundational sorting infrastructure and several list-sorting algorithms (with correctness/stability and comparison-count cost models) to the Lean portion of the library, including support for randomized timed computations.
Changes:
- Introduce
StableByValueas a general stability predicate for list sorting results. - Add deterministic stable sorts (insertion sort, bubble sort), counting sort for
ℕ, and randomized quicksort with expected-time results. - Add
RandomTimeMutilities for expectations overPMF (TimeM ...), and re-export new modules fromCslib.lean.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| Cslib/Algorithms/Lean/Sorting.lean | Adds a stability predicate (StableByValue) used by stable sort proofs. |
| Cslib/Algorithms/Lean/RandomTimeM.lean | Introduces RandomTimeM plus expected value/time definitions and lemmas. |
| Cslib/Algorithms/Lean/QuickSort/QuickSort.lean | Adds randomized quicksort, correctness proofs, and worst/expected-time bounds. |
| Cslib/Algorithms/Lean/InsertionSort/InsertionSort.lean | Adds stable insertion sort with correctness/stability and time bounds. |
| Cslib/Algorithms/Lean/CountingSort/CountingSort.lean | Adds counting sort for ℕ with correctness/stability and exact time formula. |
| Cslib/Algorithms/Lean/BubbleSort/BubbleSort.lean | Adds stable bubble sort with correctness/stability and time bounds. |
| Cslib.lean | Re-exports the new sorting and randomized-time modules from the umbrella module. |
| pointwise worst-case theorem. | ||
| -/ | ||
| theorem randomizedQuickSort_expectedTime_le_quadratic (xs : List α) : | ||
| RandomTimeM.expectedTime (randomizedQuickSort xs) ≤ (2 * xs.length * xs.length : ℕ) := by |
| (base : ENNReal) + p.expectedTimeENNReal := by | ||
| rw [expectedTimeENNReal_bind] | ||
| simp only [expectedTimeENNReal, expectedNat_pure] | ||
| rw [← expectedNat] |
| ⟨sortedPrefix.ret ++ [pass.ret.2], pass.time + sortedPrefix.time⟩ | ||
| termination_by xs => xs.length | ||
| decreasing_by | ||
| simp_wf |
Comment on lines
+156
to
+173
| /-- Expected running time is the weighted average of the time component of each result. -/ | ||
| noncomputable def expectedTime (p : RandomTimeM ℕ α) : ℝ := | ||
| expectedValue p fun result => result.time | ||
|
|
||
| /-- Expected running time as an extended nonnegative real. -/ | ||
| noncomputable def expectedTimeENNReal (p : RandomTimeM ℕ α) : ENNReal := | ||
| expectedNat p fun result => result.time | ||
|
|
||
| /-- The real and `ENNReal` expected-time views agree after converting back to `ℝ`. -/ | ||
| theorem expectedTime_eq_toReal_expectedTimeENNReal (p : RandomTimeM ℕ α) : | ||
| p.expectedTime = p.expectedTimeENNReal.toReal := by | ||
| unfold expectedTime expectedTimeENNReal expectedValue expectedNat | ||
| rw [ENNReal.tsum_toReal_eq] | ||
| · simp [ENNReal.toReal_mul] | ||
| · intro result | ||
| exact ENNReal.mul_ne_top | ||
| (ne_of_lt ((p.coe_le_one result).trans_lt ENNReal.one_lt_top)) | ||
| (by exact_mod_cast (WithTop.coe_ne_top : (result.time : ENNReal) ≠ ⊤)) |
| namespace Cslib.Algorithms.Lean | ||
|
|
||
| /-- `ys` preserves the order of equal values from `xs`. -/ | ||
| abbrev StableByValue {α : Type} [DecidableEq α] (xs ys : List α) : Prop := |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
I wanted to contribute more of the standard sorting algorithms alongside the existing MergeSort development. This PR adds stable bubble sort, stable insertion sort, counting sort for natural-number lists, and a duplicate-safe randomized quicksort, all using the existing TimeM style so the algorithms come with executable definitions, correctness theorems, and comparison-cost bounds.
For stability, I added a shared StableByValue predicate. The idea is that if we filter the input and output by any fixed value, we should get the same subsequence. This captures the usual list-level meaning of stability: equal elements appear in the same relative order after sorting. Bubble sort is stable because it swaps only when the next element is strictly smaller, never when two elements are equal. Insertion sort is stable because it processes the input from right to left and inserts a new element before equal elements already in the sorted tail. Counting sort is stable for List ℕ because each value’s filtered subsequence is completely determined by how many copies of that value occur.
The randomized quicksort part adds a small RandomTimeM abstraction: a PMF over timed computations. I used this because randomized algorithms do not produce one timed result; they produce a distribution over possible timed runs. The quicksort implementation samples a pivot index uniformly, removes that pivot occurrence, and partitions the rest into values below, equal to, and above the pivot. The equal bucket is not recursively sorted, so duplicates are handled directly instead of relying on a distinctness assumption.
For time complexity, the deterministic algorithms prove their expected comparison bounds directly in TimeM. For randomized quicksort, this PR proves sortedness and permutation correctness for every supported run, a quadratic worst-case comparison bound for every supported run, and the corresponding expected-time bound. It also adds reusable expected-time algebra for RandomTimeM, including the monad law for ENNReal expectations and a formal one-step expected-time recurrence for the actual quicksort implementation. I included the standard harmonic O(n log n) bound for the all-distinct rank recurrence as the next bridge toward the sharper average-case theorem.
AI usage: I used GPT-5.5 as an assistant while working on this PR, mainly for help navigating longer Lean proofs, trying proof refactors using some of the more advanced automation tactics. I reviewed the generated suggestions, ran the relevant CSLib checks locally, and take responsibility for the final code. Let me know if something looks bad or unnecessary.
Concretely, this PR adds:
Validation run locally: