gosort

package module
v0.0.0-...-674b8cf Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 22, 2020 License: BSD-2-Clause Imports: 5 Imported by: 0

README

Collection of Go sorting algorithms and experiments.

Plots are log-log on both axes. My first crude TimSort1 implementation does surprisingly well - the refined TimSort2 is better still.

ns

Notes

The original goal of this project was to implement minimal versions of timsort and pdqsort and validate their performance improvement over standard mergesort and quicksort. It includes many search algorithms copied or adapted from Go's pkg/sort.

Many implementations are for []int, but some are for sort.Interface to enable quicksort killer adversary testing.

Lessons learned:

  1. sort.Sort in pkg/sort (QuickSort2 in this package) is introsort
  2. sort.Stable in pkg/sort (MergeSort2 in this package) is block-insertion-sort-plus-merge - not very dissimilar to TimSort, but with fixed-size blocks instead of different-sized already-ordered runs
  3. Implementing a basic form of TimSort without galloping or magic constants was actually rather intuitive and led to good performance gains by leveraging the symMerge (in-place merge) routine already implemented for sort.Stable in pkg/sort
  4. The Block Quicksort modification of pdqsort gives worse performance in Go

Here's a conclusion from a different go-pdqsort implementation that might explain it:

From the result the pattern defeating sort is much slower than the sort from standrad library. It is expected since go's memory layout is mainly heap managed and it is very different from what's in C++ and Rust. Therefore the key invention from pdqsort to reduce the cpu branch prediction is not impactful on speed, and slowing down due to more memory access.

- https://github.com/MnO2/go-pdqsort

So far the notable implementations of pdqsort have been the original in C++ and the Rust implementation.

  1. The bad partition/pattern busting part of pdqsort has a good performance boost over a standard CLRS quicksort with bad partition selection
sevagh:go-sort $ go test -benchmem -run=^a -bench='.*(QuickSort1|PdqSort1).*Random(8192|65536)$' -v
goos: linux
goarch: amd64
pkg: github.com/sevagh/go-sort
BenchmarkPdqSort1Random8192-8               3015            383876 ns/op               0 B/op          0 allocs/op
BenchmarkPdqSort1Random65536-8               121           9831749 ns/op               0 B/op          0 allocs/op
BenchmarkQuickSort1Random8192-8              100          25272420 ns/op               0 B/op          0 allocs/op
BenchmarkQuickSort1Random65536-8             100        1673251259 ns/op               0 B/op          0 allocs/op
PASS
ok      github.com/sevagh/go-sort       173.248s
  1. Overall pdqsort is more difficult to implement than timsort
  2. The embedded bigO library for least squares fitting BigO estimation is a good method of validating or verifying any of the implemented algorithms:
sevagh:go-sort $ go test -v -count=1 -run='BigO'
=== RUN   TestBigOTimSort
--- PASS: TestBigOTimSort (4.32s)
    bigo_test.go:27: O(Nlg(N))
=== RUN   TestBigOQuickSort1WorstCase
--- PASS: TestBigOQuickSort1WorstCase (0.00s)
    bigo_test.go:46: O(N^2)
=== RUN   TestBigOInsertionSortAscending
--- PASS: TestBigOInsertionSortAscending (0.00s)
    bigo_test.go:65: O(N^2)
=== RUN   TestBigOInsertionSortRandom
--- PASS: TestBigOInsertionSortRandom (0.00s)
    bigo_test.go:84: O(Nlg(N))
PASS
ok      github.com/sevagh/go-sort       4.328s

E.g. showing that the pattern-busting of PdqSort1 (that's essentially only just bolted onto QuickSort1) reduces the worst case O(N^2):

sevagh:go-sort $ go test -v -count=1 -run='WorstCase'
=== RUN   TestBigOQuickSort1WorstCase
--- PASS: TestBigOQuickSort1WorstCase (0.00s)
    bigo_test.go:46: O(N^2)
=== RUN   TestBigOPdqSort1WorstCaseQuickSort
--- PASS: TestBigOPdqSort1WorstCaseQuickSort (0.00s)
    bigo_test.go:103: O(N)
PASS
ok      github.com/sevagh/go-sort       0.002s
  1. I learned of the killer adversary for quicksort, which is implemented in sort_test.go in the Golang stdlib pkg/sort (and copied here in adversary_test.go). Verification with a handful of sort.Interface-based versions of QuickSort and PdqSort:
sevagh:go-sort $ go test -v -run 'Adversary'
=== RUN   TestQuickSortIface1Adversary
--- FAIL: TestQuickSortIface1Adversary (0.00s)
    adversary_test.go:24: used 560000 comparisons sorting adversary data with size 10000
=== RUN   TestPdqSortIface1Adversary
--- FAIL: TestPdqSortIface1Adversary (0.00s)
    adversary_test.go:24: used 560000 comparisons sorting adversary data with size 10000
=== RUN   TestAdversaryStdSort
--- PASS: TestAdversaryStdSort (0.00s)
FAIL
exit status 1
FAIL    github.com/sevagh/go-sort       0.011s

Here we see sort.Sort, which is introsort (QuickSort2 in this project), does well. QuickSort1 is obviously vulnerable. PdqSort1 with its bad partition busting is still not clever enough to escape the killer adversary.

Given the direct []int implementation of all the algorithms here, the caller has no control on the implementation of the sort interface (namely the Less method), so the killer adversary is irrelevant.

Documentation

Overview

Package gosort implements several sorting algorithms and experiments.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func BitonicSortNaive

func BitonicSortNaive(nums *[]int)

BitonicSortNaive is my handwritten version of the CLRS bitonic sorting network with goroutines and out-of-place mergesort default ascending the real full sort of regular integers need to take a slice cause of sentinel padding

func BlockQuickSort1

func BlockQuickSort1(nums []int)

BlockQuickSort1 implements CLRS quicksort with bad pivot and block partitioning [1]

[1]: S. Edelkamp, A. Weiß, "BlockQuicksort: Avoiding Branch Mispredictions in Quicksort" https://pdfs.semanticscholar.org/b24e/f8021811cd4ef0fcc96a770657b664ee5b52.pdf

func BlockQuickSort2

func BlockQuickSort2(nums []int)

BlockQuickSort2 implements CLRS quicksort with randomized pivot and block partitioning [1]

[1]: S. Edelkamp, A. Weiß, "BlockQuicksort: Avoiding Branch Mispredictions in Quicksort" https://pdfs.semanticscholar.org/b24e/f8021811cd4ef0fcc96a770657b664ee5b52.pdf

func BlockQuickSortControl1

func BlockQuickSortControl1(nums []int)

BlockQuickSortControl1 is almost identical to BlockQuickSort1 with blocks removed, for fair bench comparisons

func BlockQuickSortControl2

func BlockQuickSortControl2(nums []int)

BlockQuickSortControl2 is almost identical to BlockQuickSort2 with blocks removed, for fair bench comparisons

func HeapSort

func HeapSort(nums []int)

HeapSort is copied from pkg/sort

func InsertionSort

func InsertionSort(nums []int)

InsertionSort is copied from pkg/sort

func MergeSort1

func MergeSort1(nums []int)

MergeSort1 implements CLRS recursive mergesort

func MergeSort2

func MergeSort2(nums []int)

MergeSort2 is pkg/sort's mergesort

it does insertion sort in blocks and merges the blocks

it looks a _little bit_ like timsort

func MergeSort3

func MergeSort3(nums []int)

MergeSort3 implements G&T iterative bottom up mergesort

func PdqSort1

func PdqSort1(nums []int)

PdqSort1 implements CLRS bad pivot quicksort with pdq bad pivot pattern breaking

Bad-pattern busting is copied from the Rust pdqsort implementation, https://docs.rs/pdqsort/0.1.0/src/pdqsort/lib.rs.html#427

func PdqSort2

func PdqSort2(nums []int)

PdqSort2 modifies pkg/sort's introsort quicksort with pdq bad pivot pattern breaking

func PdqSort3

func PdqSort3(nums []int)

PdqSort3 implements CLRS randomized pivot quicksort with pdq bad pivot pattern breaking

func PdqSortIface1

func PdqSortIface1(data sort.Interface)

PdqSortIface1 implements CLRS bad pivot quicksort with pdq bad pivot pattern breaking on a sort.Interface

This is to verify the effect of pattern-busting on the killer adversary

func QuickSort1

func QuickSort1(nums []int)

QuickSort1 implements CLRS quicksort with basic pivot selection

func QuickSort2

func QuickSort2(nums []int)

QuickSort2 is copied from pkg/sort - looks like introsort

func QuickSort3

func QuickSort3(nums []int)

QuickSort3 implements CLRS quicksort with randomized pivot selection

func QuickSortIface1

func QuickSortIface1(data sort.Interface)

QuickSortIface1 implements CLRS quicksort with basic pivot selection

using the sorting.Interface

func RadixSort

func RadixSort(nums []int)

RadixSort is based on a radix sort + counting sort for decimal digits from CLRS

func TimSort2

func TimSort2(nums []int)

TimSort2 is an optimized version of TimSort1

Types

This section is empty.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL