Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quick sort and related Arrays utilities #58

Open
wants to merge 33 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 31 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
2e62624
adding Heap Sort and Bianry Search
Aug 11, 2022
00e8667
changes as suggestions
Aug 16, 2022
fd12d32
adding dependencies
Aug 16, 2022
d8f8cee
adding 'RUN' line
Aug 16, 2022
57e35ae
'RUN' line
Aug 16, 2022
ac104b2
chnages as suggested for verification
Aug 17, 2022
9af98ed
removing predicates and lemmas not needed,formatting.
Aug 17, 2022
aab9de9
adding copyrights info.
Aug 17, 2022
513d165
more changes as suggested to remove unused import and formatting.
prvshah51 Aug 18, 2022
f44ea6f
following suggestions
prvshah51 Aug 31, 2022
71fe474
changes file structure to make import and exports simple.
prvshah51 Sep 1, 2022
41faf37
only verification error is compare expecting bool and not CompResult.
prvshah51 Sep 1, 2022
33a43a5
more changes , updating local changes.
prvshah51 Sep 8, 2022
5c8f06f
forgot to add before.
prvshah51 Sep 8, 2022
7da871a
Delete Lexicographics.dfy
prvshah51 Sep 12, 2022
62b9346
Delete LexicographicHelpers.dfy
prvshah51 Sep 12, 2022
e8e5919
suggested changes , removed Lexicographics and LexicographicsHelper .
prvshah51 Sep 12, 2022
bb009fb
removed unnecessary exports, includes from more files.
prvshah51 Sep 12, 2022
6d57e55
Remove Comparison.dfy and revert to plain relations, general cleanup
robin-aws Sep 13, 2022
9038f2a
Move Relations.dfy to src (ala Functions.dfy)
robin-aws Sep 13, 2022
a2edede
Making example verification cheaper, fixing Wrappers.dfy lit test
robin-aws Sep 13, 2022
1df340f
Use OutputCheck instead of *.expect files
robin-aws Sep 13, 2022
850e3d4
Revert remaining unneeded edits
robin-aws Sep 13, 2022
34e380a
Newline
robin-aws Sep 13, 2022
730278a
Start to use /functionSyntax:4, don’t encourage compiling SortedBy
robin-aws Sep 15, 2022
f1f31ad
Revert adding /functionSyntax:4
robin-aws Sep 15, 2022
92a19bd
Adding Quick Sort and Arrays related utilities
prvshah51 Oct 21, 2022
bc018cb
edit to run tests again
prvshah51 Nov 11, 2022
ae832a4
using different synrtax for CI
prvshah51 Nov 14, 2022
a508b99
for another file
prvshah51 Nov 14, 2022
03af996
Apply suggestions from review for readability
prvshah51 Nov 17, 2022
467f587
improving readabilty as per suggestions
prvshah51 Dec 2, 2022
9688843
Merge branch 'QuickSort-prvshah' of https://github.com/prvshah51/libr…
prvshah51 Dec 2, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
374 changes: 374 additions & 0 deletions src/Collections/Arrays/Arrays.dfy
Original file line number Diff line number Diff line change
@@ -0,0 +1,374 @@
// RUN: %dafny /compile:0 "%s"

/*******************************************************************************
* Copyright by the contributors to the Dafny Project
* SPDX-License-Identifier: MIT
*******************************************************************************/

include "../Sets/Sets.dfy"
include "../../Comparison.dfy"

module Arrays{

import opened Sets
import opened Comparison
import opened Wrappers

trait Sorter<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely needs an example in examples to demonstrate using it. I assume you have to not only provide a concrete class that implements Sorter<T> and fills in Cmp, but perhaps even has to invoke a lemma or two from the trait in order to prove that it works.

Is it possible to also provide a PredicateSorter class that is constructed around a (T, T) -> Cmp function value? Perhaps it could even be a partial function if you also provide a ghost set of all values you intend to sort with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, I'd strongly prefer to have an example, but the PredicateSorter idea can wait for a future PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yikes, I went to write an example just to play around with it and immediately got the A class may only extend a trait in the same module, unless the parent trait is annotated with {:termination false} error. That means this is currently useless unless we add {:termination false}.

Unfortunately we should treat that as a hard blocker given there's already been resistance to putting {:termination false} on any standard library code: #22 And especially given the discovery of dafny-lang/dafny#2500 I'm not inclined to push back on that.

We can think about whether there's a good alternative for this code that doesn't use traits, but it might be better to stick with this if it's good UX and at least wait for the solution to dafny-lang/dafny#2500.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not right: the proper API to use is PredicateSorter, which is a regular class, defined right below Sorter. That doesn't require :termination false.

Original code here, with examples: https://github.com/dafny-lang/compiler-bootstrap/blob/4f616822209828e48cf63d3da66ee1c010f689d4/src/Utils/Lib.Sort.dfy#L396-L420

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cpitclaudel, I see the issue now is this PR doesn't include the PredicateSorter class. We should just add that as well.

ghost const C := Comparison(Cmp)

function method Cmp(t0: T, t1: T): Cmp

twostate predicate Identical(arr: array<T>, lo: int, hi: int)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using "identical" to mean "unchanged" really threw me for a bit. Consider UnchangedSlice instead, and likewise throughout.

requires 0 <= lo <= hi <= arr.Length
reads arr
{
arr[lo..hi] == old(arr[lo..hi])
}

twostate lemma IdenticalSplit(arr: array<T>, lo: int, mid: int, hi: int)
requires 0 <= lo <= mid <= hi <= arr.Length
requires Identical(arr, lo, hi)
ensures Identical(arr, lo, mid)
ensures Identical(arr, mid, hi)
{
assert arr[lo..mid] == arr[lo..hi][..mid-lo];
}

twostate lemma IdenticalShuffled(arr: array<T>, lo: int, hi: int)
requires 0 <= lo <= hi <= arr.Length
requires Identical(arr, lo, hi)
ensures Shuffled(arr, lo, hi)
{}

twostate predicate SameElements(arr: array<T>, lo: int, hi: int)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider UnchangedExceptSliceShuffled instead.

reads arr
requires 0 <= lo <= hi <= arr.Length
{
&& Identical(arr, 0, lo)
&& Shuffled(arr, lo, hi)
&& Identical(arr, hi, arr.Length)
}

twostate lemma SameElementsExtend(arr: array<T>, lo: int, lo': int, hi': int, hi: int)
requires 0 <= lo <= lo' <= hi' <= hi <= arr.Length
requires SameElements(arr, lo', hi')
ensures SameElements(arr, lo, hi)
{
assert SameElements(arr, lo', hi');
assert Identical(arr, 0, lo') && Shuffled(arr, lo', hi') && Identical(arr, hi', arr.Length);
IdenticalSplit(arr, 0, lo, lo'); IdenticalSplit(arr, hi', hi, arr.Length);
assert && Identical(arr, 0, lo) && Identical(arr, lo, lo')
&& Shuffled(arr, lo', hi')
&& Identical(arr, hi', hi) && Identical(arr, hi, arr.Length);
IdenticalShuffled(arr, lo, lo'); IdenticalShuffled(arr, hi', hi);
assert && Identical(arr, 0, lo) && Shuffled(arr, lo, lo')
&& Shuffled(arr, lo', hi')
&& Shuffled(arr, hi', hi) && Identical(arr, hi, arr.Length);
ShuffledConcat(arr, lo, lo', hi'); ShuffledConcat(arr, lo, hi', hi);
assert Identical(arr, 0, lo) && Shuffled(arr, lo, hi) && Identical(arr, hi, arr.Length);
assert SameElements(arr, lo, hi);
}

twostate predicate Shuffled(arr: array<T>, lo: int, hi: int)
requires 0 <= lo <= hi <= arr.Length
reads arr
{
multiset(arr[lo..hi]) == old(multiset(arr[lo..hi]))
Copy link
Contributor Author

@prvshah51 prvshah51 Nov 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember adding this to almost every every function method in MergeSort.dfy will need to replace that with shuffled instead.

}

predicate Sortable(arr: array<T>, lo: int, hi: int)
requires 0 <= lo <= hi <= arr.Length
reads arr
{
reveal C.Valid?(); // Leaks through
C.Valid?(Sets.OfSlice(arr, lo, hi))
}

twostate lemma SetOfSliceShuffled(arr: array<T>, lo: int, hi: int)
requires 0 <= lo <= hi <= arr.Length
requires Shuffled(arr, lo, hi)
ensures Sets.OfSlice(arr, lo, hi) == old(Sets.OfSlice(arr, lo, hi))
{
calc {
old(Sets.OfSlice(arr, lo, hi));
set x | x in old(arr[lo..hi]);
set x | x in old(multiset(arr[lo..hi]));
set x | x in multiset(arr[lo..hi]);
set x | x in arr[lo..hi];
Sets.OfSlice(arr, lo, hi);
}
}

twostate lemma SortableSameElements(arr: array<T>, lo: int, hi: int)
requires 0 <= lo <= hi <= arr.Length
requires old(Sortable(arr, lo, hi))
requires SameElements(arr, lo, hi)
ensures Sortable(arr, lo, hi)
{
SetOfSliceShuffled(arr, lo, hi);
}

twostate lemma ShuffledConcat(arr: array<T>, lo: int, mid: int, hi: int)
requires 0 <= lo <= mid <= hi <= arr.Length
requires Shuffled(arr, lo, mid)
requires Shuffled(arr, mid, hi)
ensures Shuffled(arr, lo, hi)
{
calc {
old(multiset(arr[lo..hi]));
{ assert old(arr[lo..hi] == arr[lo..mid] + arr[mid..hi]); }
old(multiset(arr[lo..mid] + arr[mid..hi]));
old(multiset(arr[lo..mid])) + old(multiset(arr[mid..hi]));
multiset(arr[lo..mid]) + multiset(arr[mid..hi]);
{ assert arr[lo..hi] == arr[lo..mid] + arr[mid..hi]; }
multiset(arr[lo..hi]);
}
}

lemma Sortable_Slice(arr: array<T>, lo: int, hi: int, lo': int, hi': int)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lemma Sortable_Slice(arr: array<T>, lo: int, hi: int, lo': int, hi': int)
lemma SortableSlice(arr: array<T>, lo: int, hi: int, lo': int, hi': int)

Only instance of an underscore I see here and it really sticks out.

Also consider SortableSubslice instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SortableSlice or SortableSubslice both makes sense putting Subslice describes it better.

requires 0 <= lo <= lo' <= hi' <= hi <= arr.Length
requires Sortable(arr, lo, hi)
ensures Sortable(arr, lo', hi')
{}

predicate Sorted(arr: array<T>, lo: int, hi: int)
reads arr
requires 0 <= lo <= hi <= arr.Length
{
C.Sorted(arr[lo..hi])
}

lemma StripedInit(sq: seq<T>, pivot: T, lo: int, hi: int)
requires 0 <= lo <= hi <= |sq|
ensures C.Striped(sq, pivot, lo, lo, lo, hi, hi)
{
reveal C.Striped();
}

lemma StripedNonEmpty(sq: seq<T>, pivot: T, lo: int, left: int, mid: int, right: int, hi: int)
requires 0 <= lo <= left <= mid <= right <= hi <= |sq|
requires C.Striped(sq, pivot, lo, left, mid, right, hi)
requires C.Valid?(Sets.OfSeq(sq[lo..hi]))
requires pivot in sq[lo..hi]
ensures left < right
{
reveal C.Valid?();
var idx :| lo <= idx < hi && sq[idx] == pivot;
C.AlwaysReflexive(pivot);
reveal C.Striped();
}

twostate lemma StripedSameElements(arr: array<T>, pivot: T, lo: int, left: int, right: int, hi: int)
requires 0 <= lo <= left <= right <= hi <= |arr[..]|
requires Shuffled(arr, lo, left)
requires Identical(arr, left, right)
requires Shuffled(arr, right, hi)
requires old(C.Striped(arr[..], pivot, lo, left, right, right, hi))
ensures C.Striped(arr[..], pivot, lo, left, right, right, hi)
{
reveal C.Striped();
forall i | lo <= i < left ensures Cmp(arr[..][i], pivot).Lt? {
assert arr[..][i] == arr[lo..left][i - lo];
assert arr[..][i] in multiset(arr[lo..left]);
}
forall i | left <= i < right ensures Cmp(arr[..][i], pivot).Eq? {
assert arr[..][i] == arr[left..right][i - left];
}
forall i | right <= i < hi ensures Cmp(arr[..][i], pivot).Gt? {
assert arr[..][i] == arr[right..hi][i - right];
assert arr[..][i] in multiset(arr[right..hi]);
}
}

method Swap(arr: array<T>, ghost lo: int, i: int, j: int, ghost hi: int)
requires 0 <= lo <= i <= j < hi <= arr.Length
modifies arr
ensures SameElements(arr, lo, hi)
ensures arr[..] == old(arr[..][i := arr[j]][j := arr[i]])
{
arr[i], arr[j] := arr[j], arr[i];
}

method SwapLt(
arr: array<T>, pivot: T,
lo: int, left: int, mid: int, right: int, hi: int
)
requires 0 <= lo <= left <= mid < right <= hi <= arr.Length
requires C.Striped(arr[..], pivot, lo, left, mid, right, hi)
requires Cmp(arr[mid], pivot).Lt?
modifies arr
ensures SameElements(arr, lo, hi)
ensures arr[..] == old(arr[..][left := arr[mid]][mid := arr[left]])
ensures C.Striped(arr[..], pivot, lo, left + 1, mid + 1, right, hi)
{
reveal C.Striped();
Swap(arr, lo, left, mid, hi);
}

method SwapGt(
arr: array<T>, pivot: T,
lo: int, left: int, mid: int, right: int, hi: int
)
requires 0 <= lo <= left <= mid < right <= hi <= arr.Length
requires C.Striped(arr[..], pivot, lo, left, mid, right, hi)
requires Cmp(arr[mid], pivot).Gt?
modifies arr
ensures SameElements(arr, lo, hi)
ensures arr[..] == old(arr[..][mid := arr[right - 1]][right - 1 := arr[mid]])
ensures C.Striped(arr[..], pivot, lo, left, mid, right - 1, hi)
{
reveal C.Striped();
Swap(arr, lo, mid, right - 1, hi);
}

lemma SortedStripedMiddle(arr: array<T>, pivot: T, lo: int, left: int, right: int, hi: int)
requires 0 <= lo <= left < right <= hi <= arr.Length
requires Sortable(arr, lo, hi)
requires pivot in multiset(arr[lo..hi])
requires C.Striped(arr[..], pivot, lo, left, right, right, hi)
ensures Sorted(arr, left, right)
{
reveal C.Striped();
forall i, j | left <= i < j < right ensures Cmp(arr[i], arr[j]).Le? {
var idx :| lo <= idx < hi && arr[idx] == pivot;
assert Cmp(arr[i], pivot) == Eq;
assert C.Complete??(arr[j], pivot);
assert C.Transitive??(arr[i], pivot, arr[j]);
}
}

lemma SortedConcat(arr: array<T>, lo: int, mid: int, hi: int)
requires 0 <= lo <= mid <= hi <= arr.Length
requires Sorted(arr, lo, mid)
requires Sorted(arr, mid, hi)
requires forall i, j | lo <= i < mid <= j < hi :: Cmp(arr[i], arr[j]).Le?
ensures Sorted(arr, lo, hi)
{
assert arr[lo..mid] + arr[mid..hi] == arr[lo..hi];
C.SortedConcat(arr[lo..mid], arr[mid..hi]);
}

twostate lemma SortedIdentical(arr: array<T>, lo: int, hi: int)
requires 0 <= lo <= hi <= arr.Length
requires old(Sorted(arr, lo, hi))
requires Identical(arr, lo, hi)
ensures Sorted(arr, lo, hi)
{
assert arr[lo..hi] == old(arr[lo..hi]);
forall i, j | lo <= i < j < hi ensures Cmp(arr[i], arr[j]).Le? {
assert arr[i] == arr[lo..hi][i - lo];
assert arr[j] == arr[lo..hi][j - lo];
}
}

lemma SortedDutchFlag(arr: array<T>, pivot: T, lo: int, left: int, right: int, hi: int)
requires 0 <= lo <= left < right <= hi <= arr.Length
requires Sortable(arr, lo, hi)
requires pivot in multiset(arr[lo..hi])
requires Sorted(arr, lo, left)
requires Sorted(arr, right, hi)
requires C.Striped(arr[..], pivot, lo, left, right, right, hi)
ensures Sorted(arr, lo, hi)
{
reveal C.Striped();
forall i, j | lo <= i < left <= j < right ensures Cmp(arr[i], arr[j]).Le? {
var idx :| lo <= idx < hi && arr[idx] == pivot;
assert Cmp(arr[i], pivot).Le?;
assert Cmp(arr[j], pivot).Eq?;
assert C.Complete??(arr[j], pivot);
assert C.Transitive??(arr[i], pivot, arr[j]);
}
SortedStripedMiddle(arr, pivot, lo, left, right, hi);
SortedConcat(arr, lo, left, right);
forall i, j | lo <= i < right <= j < hi ensures Cmp(arr[i], arr[j]).Le? {
assert Cmp(arr[i], pivot).Le?;
assert Cmp(arr[j], pivot).Gt?;
assert C.Complete??(arr[i], pivot);
assert C.Complete??(arr[j], pivot);
assert C.Transitive??(arr[i], pivot, arr[j]);
}
SortedConcat(arr, lo, right, hi);
}

method QuickSort(arr: array<T>, lo: int := 0, hi: int := arr.Length)
requires 0 <= lo <= hi <= arr.Length
requires Sortable(arr, lo, hi)
decreases hi - lo
modifies arr
ensures Sortable(arr, lo, hi)
ensures Sorted(arr, lo, hi)
ensures SameElements(arr, lo, hi)
{
if hi - lo > 1 {
var pivot := MedianOfMedians(arr, lo, hi);
var left, right := DutchFlag(arr, pivot, lo, hi);

label left:
SortableSameElements(arr, lo, hi);
Sortable_Slice(arr, lo, hi, lo, left);
StripedNonEmpty(arr[..], pivot, lo, left, right, right, hi);
QuickSort(arr, lo, left);

label right:
SameElementsExtend@left(arr, lo, lo, left, hi);
SortableSameElements(arr, lo, hi);
Sortable_Slice(arr, lo, hi, right, hi);
IdenticalSplit@left(arr, left, hi, arr.Length);
IdenticalSplit@left(arr, left, right, hi);
IdenticalShuffled@left(arr, right, hi);
StripedSameElements@left(arr, pivot, lo, left, right, hi);
QuickSort(arr, right, hi);

SameElementsExtend@right(arr, lo, right, hi, hi);
SortableSameElements(arr, lo, hi);
IdenticalSplit@right(arr, lo, left, right);
SortedIdentical@right(arr, lo, left);
StripedSameElements@right(arr, pivot, lo, left, right, hi);
SortedDutchFlag(arr, pivot, lo, left, right, hi);
}
}

function method MedianOfMedians(arr: array<T>, lo: int, hi: int): (t: T)
requires 0 <= lo < hi <= arr.Length
reads arr
ensures t in arr[lo..hi]
{
arr[lo] // TODO
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should either implement this or drop the function and just inline this choice of pivot above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's implement it — it's not hard and it's an important part of making the algorithm safe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per offline conversation we decided not to block on this, but in that case we should add a comment on QuickSort that it doesn't yet have the intended O(n log n) worst case runtime yet (which is what @cpitclaudel was referring to as "safe").

}

method DutchFlag(arr: array<T>, pivot: T, lo: int, hi: int) returns (left: int, right: int)
requires 0 <= lo < hi <= arr.Length
requires pivot in multiset(arr[lo..hi])
modifies arr
ensures SameElements(arr, lo, hi)
ensures lo <= left <= right <= hi
ensures C.Striped(arr[..], pivot, lo, left, right, right, hi)
{
left, right := lo, hi;
var mid := lo;
StripedInit(arr[..], pivot, lo, hi);

while mid < right
invariant pivot in multiset(arr[lo..hi])
invariant lo <= left <= mid <= right <= hi
invariant SameElements(arr, lo, hi)
invariant C.Striped(arr[..], pivot, lo, left, mid, right, hi)
{
reveal C.Striped();
match Cmp(arr[mid], pivot) {
case Lt =>
SwapLt(arr, pivot, lo, left, mid, right, hi);
left := left + 1;
mid := mid + 1;
case Gt =>
SwapGt(arr, pivot, lo, left, mid, right, hi);
right := right - 1;
case Eq =>
mid := mid + 1;
}
}

}
}
}
13 changes: 12 additions & 1 deletion src/Collections/Sets/Sets.dfy
Original file line number Diff line number Diff line change
Expand Up @@ -193,4 +193,15 @@ module Sets {
LemmaSubsetSize(x, range);
}

}
function method OfSeq<T(==)>(sq: seq<T>): set<T> {
set x | x in sq
}

function method OfSlice<T(==)>(arr: array<T>, lo: int, hi: int): set<T>
requires 0 <= lo <= hi <= arr.Length
reads arr
{
OfSeq(arr[lo..hi])
}

}
Loading