feat(algorithms): add bubble, insertion, counting, and quicksort by Robertboy18 · Pull Request #570 · leanprover/cslib

Robertboy18 · 2026-05-17T19:57:20Z

I wanted to contribute more of the standard sorting algorithms alongside the existing MergeSort development. This PR adds stable bubble sort, stable insertion sort, counting sort for natural-number lists, and a duplicate-safe randomized quicksort, all using the existing TimeM style so the algorithms come with executable definitions, correctness theorems, and comparison-cost bounds.

For stability, I added a shared StableByValue predicate. The idea is that if we filter the input and output by any fixed value, we should get the same subsequence. This captures the usual list-level meaning of stability: equal elements appear in the same relative order after sorting. Bubble sort is stable because it swaps only when the next element is strictly smaller, never when two elements are equal. Insertion sort is stable because it processes the input from right to left and inserts a new element before equal elements already in the sorted tail. Counting sort is stable for List ℕ because each value’s filtered subsequence is completely determined by how many copies of that value occur.

The randomized quicksort part adds a small RandomTimeM abstraction: a PMF over timed computations. I used this because randomized algorithms do not produce one timed result; they produce a distribution over possible timed runs. The quicksort implementation samples a pivot index uniformly, removes that pivot occurrence, and partitions the rest into values below, equal to, and above the pivot. The equal bucket is not recursively sorted, so duplicates are handled directly instead of relying on a distinctness assumption.

For time complexity, the deterministic algorithms prove their expected comparison bounds directly in TimeM. For randomized quicksort, this PR proves sortedness and permutation correctness for every supported run, a quadratic worst-case comparison bound for every supported run, and the corresponding expected-time bound. It also adds reusable expected-time algebra for RandomTimeM, including the monad law for ENNReal expectations and a formal one-step expected-time recurrence for the actual quicksort implementation. I included the standard harmonic O(n log n) bound for the all-distinct rank recurrence as the next bridge toward the sharper average-case theorem.

AI usage: I used GPT-5.5 as an assistant while working on this PR, mainly for help navigating longer Lean proofs, trying proof refactors using some of the more advanced automation tactics. I reviewed the generated suggestions, ran the relevant CSLib checks locally, and take responsibility for the final code. Let me know if something looks bad or unnecessary.

Concretely, this PR adds:

StableByValue as a shared stability predicate for list sorting.
Stable bubble sort with sortedness, permutation, stability, and quadratic time proofs.
Stable insertion sort with sortedness, permutation, stability, and quadratic time proofs.
Counting sort on List ℕ with count preservation, sortedness, permutation, stability, and exact time proofs.
RandomTimeM, with expected-value and expected-time lemmas for randomized timed computations.
Randomized quicksort with PMF semantics, duplicate-safe three-way partitioning, correctness for every supported run, quadratic worst-case time, expected-time bounds, and an expected-time recurrence lemma.

Validation run locally:

lake build Cslib
lake test
lake lint
lake exe checkInitImports
lake exe lint-style
lake exe mk_all --module

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds foundational sorting infrastructure and several list-sorting algorithms (with correctness/stability and comparison-count cost models) to the Lean portion of the library, including support for randomized timed computations.

Changes:

Introduce StableByValue as a general stability predicate for list sorting results.
Add deterministic stable sorts (insertion sort, bubble sort), counting sort for ℕ, and randomized quicksort with expected-time results.
Add RandomTimeM utilities for expectations over PMF (TimeM ...), and re-export new modules from Cslib.lean.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
Cslib/Algorithms/Lean/Sorting.lean	Adds a stability predicate (`StableByValue`) used by stable sort proofs.
Cslib/Algorithms/Lean/RandomTimeM.lean	Introduces `RandomTimeM` plus expected value/time definitions and lemmas.
Cslib/Algorithms/Lean/QuickSort/QuickSort.lean	Adds randomized quicksort, correctness proofs, and worst/expected-time bounds.
Cslib/Algorithms/Lean/InsertionSort/InsertionSort.lean	Adds stable insertion sort with correctness/stability and time bounds.
Cslib/Algorithms/Lean/CountingSort/CountingSort.lean	Adds counting sort for `ℕ` with correctness/stability and exact time formula.
Cslib/Algorithms/Lean/BubbleSort/BubbleSort.lean	Adds stable bubble sort with correctness/stability and time bounds.
Cslib.lean	Re-exports the new sorting and randomized-time modules from the umbrella module.

+pointwise worst-case theorem.
+-/
+theorem randomizedQuickSort_expectedTime_le_quadratic (xs : List α) :
+    RandomTimeM.expectedTime (randomizedQuickSort xs) ≤ (2 * xs.length * xs.length : ℕ) := by


+      (base : ENNReal) + p.expectedTimeENNReal := by
+  rw [expectedTimeENNReal_bind]
+  simp only [expectedTimeENNReal, expectedNat_pure]
+  rw [← expectedNat]


+    ⟨sortedPrefix.ret ++ [pass.ret.2], pass.time + sortedPrefix.time⟩
+termination_by xs => xs.length
+decreasing_by
+  simp_wf


+/-- Expected running time is the weighted average of the time component of each result. -/
+noncomputable def expectedTime (p : RandomTimeM ℕ α) : ℝ :=
+  expectedValue p fun result => result.time
+
+/-- Expected running time as an extended nonnegative real. -/
+noncomputable def expectedTimeENNReal (p : RandomTimeM ℕ α) : ENNReal :=
+  expectedNat p fun result => result.time
+
+/-- The real and `ENNReal` expected-time views agree after converting back to `ℝ`. -/
+theorem expectedTime_eq_toReal_expectedTimeENNReal (p : RandomTimeM ℕ α) :
+    p.expectedTime = p.expectedTimeENNReal.toReal := by
+  unfold expectedTime expectedTimeENNReal expectedValue expectedNat
+  rw [ENNReal.tsum_toReal_eq]
+  · simp [ENNReal.toReal_mul]
+  · intro result
+    exact ENNReal.mul_ne_top
+      (ne_of_lt ((p.coe_le_one result).trans_lt ENNReal.one_lt_top))
+      (by exact_mod_cast (WithTop.coe_ne_top : (result.time : ENNReal) ≠ ⊤))


+namespace Cslib.Algorithms.Lean
+
+/-- `ys` preserves the order of equal values from `xs`. -/
+abbrev StableByValue {α : Type} [DecidableEq α] (xs ys : List α) : Prop :=


Robertboy18 added 3 commits May 17, 2026 18:26

Add timed sorting algorithms

fe0303e

Strengthen randomized quicksort expected-time proofs

63ebbea

Clean sorting modules for CSLib style

7985773

Copilot AI review requested due to automatic review settings May 17, 2026 19:57

Robertboy18 requested review from chenson2018 and fmontesi as code owners May 17, 2026 19:57

Copilot AI reviewed May 17, 2026

View reviewed changes

Address Copilot sorting review comments

3a07631

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(algorithms): add bubble, insertion, counting, and quicksort#570

feat(algorithms): add bubble, insertion, counting, and quicksort#570
Robertboy18 wants to merge 4 commits into
leanprover:mainfrom
Robertboy18:algorithms-sorting-timem

Robertboy18 commented May 17, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Robertboy18 commented May 17, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants