Algorithms


Algorithms

Post by Pigeon » Fri Nov 30, 2012 7:34 pm

A binary search or half-interval search algorithm finds the position of a specified value (the input "key") within a sorted array. In each step, the algorithm compares the input key value with the key value of the middle element of the array. If the keys match, then a matching element has been found so its index, or position, is returned. Otherwise, if the sought key is less than the middle element's key, then the algorithm repeats its action on the sub-array to the left of the middle element or, if the input key is greater, on the sub-array to the right. If the remaining array to be searched is reduced to zero, then the key cannot be found in the array and a special "Not found" indication is returned.

A binary search halves the number of items to check with each iteration, so locating an item (or determining its absence) takes logarithmic time. A binary search is a dichotomic divide and conquer search algorithm.
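
For concreteness, here's a minimal Python sketch of the halving loop described above (my own illustration, not code from the linked article; the function name and the -1 "not found" convention are assumptions):

    # Iterative binary search over a sorted list.
    # Returns the index of key in a, or -1 if key is absent.
    def binary_search(a, key):
        lo, hi = 0, len(a) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if a[mid] == key:
                return mid           # keys match: return the position
            elif key < a[mid]:
                hi = mid - 1         # repeat on the left sub-array
            else:
                lo = mid + 1         # repeat on the right sub-array
        return -1                    # range reduced to zero: not found

    # binary_search([2, 3, 5, 7, 11], 7) -> 3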

Link



Re: Algorithms

Post by Pigeon » Fri Nov 30, 2012 7:37 pm

Divide and conquer (D&C) is an important algorithm design paradigm based on multi-branched recursion. A divide and conquer algorithm works by recursively breaking down a problem into two or more sub-problems of the same (or related) type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.
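
As a toy illustration of the pattern (my own sketch, not from the linked article), here is the sum of a list computed divide-and-conquer style in Python:

    # Divide the list in half, recursively solve each half, combine the results.
    def dc_sum(a):
        if len(a) <= 1:                  # simple enough to solve directly
            return a[0] if a else 0
        mid = len(a) // 2
        return dc_sum(a[:mid]) + dc_sum(a[mid:])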

This technique is the basis of efficient algorithms for all kinds of problems, such as sorting (e.g., quicksort, merge sort), multiplying large numbers (e.g. Karatsuba), syntactic analysis (e.g., top-down parsers), and computing the discrete Fourier transform (FFTs).

On the other hand, the ability to understand and design D&C algorithms is a skill that takes time to master. As when proving a theorem by induction, it is often necessary to replace the original problem by a more general or complicated problem in order to get the recursion going, and there is no systematic method for finding the proper generalization.

The name "divide and conquer" is sometimes applied also to algorithms that reduce each problem to only one subproblem, such as the binary search algorithm for finding a record in a sorted list (or its analog in numerical computing, the bisection algorithm for root finding). These algorithms can be implemented more efficiently than general divide-and-conquer algorithms; in particular, if they use tail recursion, they can be converted into simple loops. Under this broad definition, however, every algorithm that uses recursion or loops could be regarded as a "divide and conquer algorithm". Therefore, some authors consider that the name "divide and conquer" should be used only when each problem may generate two or more subproblems. The name decrease and conquer has been proposed instead for the single-subproblem class

Link



Re: Algorithms

Post by Pigeon » Fri Nov 30, 2012 7:46 pm

Merge sort (also commonly spelled mergesort) is an O(n log n) comparison-based sorting algorithm. Most implementations produce a stable sort, which means that the implementation preserves the input order of equal elements in the sorted output. Merge sort is a divide and conquer algorithm that was invented by John von Neumann in 1945. A detailed description and analysis of bottom-up mergesort appeared in a report by Goldstine and von Neumann as early as 1948.

Conceptually, a merge sort works as follows:
  • Divide the unsorted list into n sublists, each containing 1 element (a list of 1 element is considered sorted).
  • Repeatedly merge sublists to produce new sublists until there is only 1 sublist remaining. This will be the sorted list.
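
A Python sketch of this bottom-up scheme (my own illustration; the names are assumptions, not code from the linked article):

    # Merge two already-sorted lists into one sorted list.
    def merge(left, right):
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:      # <= keeps equal elements in input order (stable)
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        out.extend(left[i:])             # at most one of these is non-empty
        out.extend(right[j:])
        return out

    def merge_sort(a):
        sublists = [[x] for x in a]      # n sublists of one element each
        if not sublists:
            return []
        while len(sublists) > 1:         # repeatedly merge pairs of sublists
            merged = [merge(sublists[i], sublists[i + 1])
                      for i in range(0, len(sublists) - 1, 2)]
            if len(sublists) % 2:
                merged.append(sublists[-1])   # odd sublist carries over to the next round
            sublists = merged
        return sublists[0]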

Link



Re: Algorithms

Post by Pigeon » Fri Nov 30, 2012 8:05 pm

Quicksort is a sorting algorithm developed by Tony Hoare that, on average, makes O(n log n) comparisons to sort n items. It is also known as partition-exchange sort. In the worst case, it makes O(n²) comparisons, though this behavior is rare.

Quicksort is often faster in practice than other O(n log n) algorithms. Additionally, quicksort's sequential and localized memory references work well with a cache. Quicksort can be implemented with an in-place partitioning algorithm, so the entire sort can be done with only O(log n) additional space.

Quicksort is a divide and conquer algorithm. Quicksort first divides a large list into two smaller sub-lists: the low elements and the high elements. Quicksort can then recursively sort the sub-lists.

The steps are:
  • Pick an element, called a pivot, from the list.
  • Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
  • Recursively sort the sub-list of lesser elements and the sub-list of greater elements.
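
A minimal (not in-place) Python sketch of these steps, mine rather than the article's; note that a production quicksort would partition in place to get the O(log n) space bound mentioned above:

    def quicksort(a):
        if len(a) <= 1:
            return a
        pivot = a[len(a) // 2]                   # pick a pivot (the choice is arbitrary)
        less    = [x for x in a if x < pivot]    # elements before the pivot
        equal   = [x for x in a if x == pivot]   # pivots are already in final position
        greater = [x for x in a if x > pivot]    # elements after the pivot
        return quicksort(less) + equal + quicksort(greater)
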
Link



Re: Algorithms

Post by Pigeon » Fri Nov 30, 2012 8:21 pm

Bubble sort, sometimes referred to as sinking sort, is a simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted.

The algorithm gets its name from the way smaller elements "bubble" to the top of the list. Because it only uses comparisons to operate on elements, it is a comparison sort. Although the algorithm is simple, most of the other sorting algorithms are more efficient for large lists.
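
A short Python sketch of the repeated passes (my own illustration):

    def bubble_sort(a):
        n = len(a)
        swapped = True
        while swapped:                   # repeat until a pass makes no swaps
            swapped = False
            for i in range(n - 1):
                if a[i] > a[i + 1]:      # adjacent pair in the wrong order
                    a[i], a[i + 1] = a[i + 1], a[i]
                    swapped = True
            n -= 1                       # the largest remaining element is now in place
        return a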

Link



Re: Algorithms

Post by Pigeon » Fri Nov 30, 2012 8:32 pm

Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort.

However, insertion sort provides several advantages:
  • Simple implementation
  • Efficient for (quite) small data sets
  • Adaptive (i.e., efficient) for data sets that are already substantially sorted: the time complexity is O(n + d), where d is the number of inversions
  • More efficient in practice than most other simple quadratic (i.e., O(n²)) algorithms such as selection sort or bubble sort; the best case (nearly sorted input) is O(n)
  • Stable; i.e., does not change the relative order of elements with equal keys
  • In-place; i.e., only requires a constant amount O(1) of additional memory space
  • Online; i.e., can sort a list as it receives it
When humans manually sort something (for example, a deck of playing cards), most use a method that is similar to insertion sort.

Every repetition of insertion sort removes an element from the input data, inserting it into the correct position in the already-sorted list, until no input elements remain. The choice of which element to remove from the input is arbitrary, and can be made using almost any choice algorithm.

Sorting is typically done in-place. After k iterations, the first k + 1 entries of the array are sorted. In each iteration, the first remaining entry of the input is removed and inserted into the result at the correct position, thus extending the result.

The most common variant of insertion sort, which operates on arrays, can be described as follows:

Suppose there exists a function called Insert designed to insert a value into a sorted sequence at the beginning of an array. It operates by beginning at the end of the sequence and shifting each element one place to the right until a suitable position is found for the new element. The function has the side effect of overwriting the value stored immediately after the sorted sequence in the array.

To perform an insertion sort, begin at the left-most element of the array and invoke Insert to insert each element encountered into its correct position. The ordered sequence into which the element is inserted is stored at the beginning of the array in the set of indices already examined. Each insertion overwrites a single value: the value being inserted.
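
In Python, the array variant described above might look like this (a hedged sketch; the in-lined shifting loop plays the role of the Insert function):

    def insertion_sort(a):
        for k in range(1, len(a)):       # a[:k] is already sorted
            value = a[k]                 # next input element to insert
            i = k - 1
            while i >= 0 and a[i] > value:
                a[i + 1] = a[i]          # shift each larger element one place right
                i -= 1
            a[i + 1] = value             # suitable position found: insert
        return a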

Link



Re: Algorithms

Post by Pigeon » Fri Nov 30, 2012 8:47 pm

Pigeonhole sorting, also known as count sort (not to be confused with counting sort), is a sorting algorithm that is suitable for sorting lists of elements where the number of elements (n) and the number of possible key values (N) are approximately the same. It requires O(n + N) time.

The pigeonhole algorithm works as follows:
  • Given an array of values to be sorted, set up an auxiliary array of initially empty "pigeonholes," one pigeonhole for each key in the range of the original array.
  • Going over the original array, put each value into the pigeonhole corresponding to its key, such that each pigeonhole eventually contains a list of all values with that key.
  • Iterate over the pigeonhole array in order, and put elements from non-empty pigeonholes back into the original array.
For example, suppose we were sorting these value pairs by their first element:

(5, "hello")
(3, "pie")
(8, "apple")
(5, "king")

For each value between 3 and 8 we set up a pigeonhole, then move each element to its pigeonhole:

3: (3, "pie")
4:
5: (5, "hello"), (5, "king")
6:
7:
8: (8, "apple")

We then iterate over the pigeonhole array in order and move them back to the original list.
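
A small Python sketch of this procedure (my own illustration; representing the elements as (key, value) tuples is an assumption):

    def pigeonhole_sort(pairs):
        if not pairs:
            return []
        lo = min(k for k, _ in pairs)
        hi = max(k for k, _ in pairs)
        holes = [[] for _ in range(hi - lo + 1)]   # one pigeonhole per key in range
        for k, v in pairs:
            holes[k - lo].append((k, v))           # put each value into its pigeonhole
        out = []
        for hole in holes:                         # iterate pigeonholes in order
            out.extend(hole)
        return out

    # pigeonhole_sort([(5, "hello"), (3, "pie"), (8, "apple"), (5, "king")])
    # -> [(3, "pie"), (5, "hello"), (5, "king"), (8, "apple")]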

The difference between pigeonhole sort and counting sort is that in counting sort, the auxiliary array does not contain lists of input elements, only counts:

3: 1
4: 0
5: 2
6: 0
7: 0
8: 1

Using this information we can perform a series of exchanges on the input array that puts it in order, moving items only once. Pigeonhole sort, in contrast, moves items twice: once onto the pigeonhole/bucket array and again onto the destination array.

For arrays where N is much larger than n, bucket sort is a generalization that is more efficient in space and time.

Link



Re: Algorithms

Post by Pigeon » Fri Nov 30, 2012 8:52 pm

Bucket sort, or bin sort, is a sorting algorithm that works by partitioning an array into a number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm, or by recursively applying the bucket sorting algorithm. It is a distribution sort, and is a cousin of radix sort in the most to least significant digit flavour. Bucket sort is a generalization of pigeonhole sort. Since bucket sort is not a comparison sort, the Ω(n log n) lower bound is inapplicable. The computational complexity estimates involve the number of buckets.

Bucket sort works as follows:
  • Set up an array of initially empty "buckets."
  • Scatter: Go over the original array, putting each object in its bucket.
  • Sort each non-empty bucket.
  • Gather: Visit the buckets in order and put all elements back into the original array.
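
A minimal Python sketch of the scatter/sort/gather steps above, under the assumption that the inputs are floats in [0, 1) so a value's bucket index scales directly with the value:

    def bucket_sort(a, n_buckets=10):
        buckets = [[] for _ in range(n_buckets)]   # set up empty buckets
        for x in a:
            buckets[int(x * n_buckets)].append(x)  # scatter: assumes 0 <= x < 1
        out = []
        for b in buckets:                          # gather in order,
            out.extend(sorted(b))                  # sorting each bucket (here via the built-in sort)
        return out
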
Link



Re: Algorithms

Post by Pigeon » Fri Nov 30, 2012 9:11 pm

Radix sort is a non-comparative integer sorting algorithm that sorts data with integer keys by grouping keys by the individual digits which share the same significant position and value. A positional notation is required, but because integers can represent strings of characters (e.g., names or dates) and specially formatted floating point numbers, radix sort is not limited to integers. Radix sort dates back as far as 1887 to the work of Herman Hollerith on tabulating machines.

Most digital computers internally represent all of their data as electronic representations of binary numbers, so processing the digits of integer representations by groups of binary digit representations is most convenient. Two classifications of radix sorts are least significant digit (LSD) radix sorts and most significant digit (MSD) radix sorts. LSD radix sorts process the integer representations starting from the least digit and move towards the most significant digit. MSD radix sorts work the other way around.

The integer representations that are processed by sorting algorithms are often called "keys", which can exist all by themselves or be associated with other data. LSD radix sorts typically use the following sorting order: short keys come before longer keys, and keys of the same length are sorted lexicographically. This coincides with the normal order of integer representations, such as the sequence 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. MSD radix sorts use lexicographic order, which is suitable for sorting strings, such as words, or fixed-length integer representations. A sequence such as "b, c, d, e, f, g, h, i, j, ba" would be lexicographically sorted as "b, ba, c, d, e, f, g, h, i, j". If lexicographic ordering is used to sort variable-length integer representations, then the representations of the numbers from 1 to 10 would be output as 1, 10, 2, 3, 4, 5, 6, 7, 8, 9, as if the shorter keys were left-justified and padded on the right with blank characters to make the shorter keys as long as the longest key for the purpose of determining sorted order.

Original, unsorted list:

170, 45, 75, 90, 802, 24, 2, 66

Sorting by least significant digit (1s place) gives: [Notice that we keep 802 before 2, because 802 occurred before 2 in the original list, and similarly for the pairs 170 & 90 and 45 & 75.]

170, 90, 802, 2, 24, 45, 75, 66

Sorting by next digit (10s place) gives: [Notice that 802 again comes before 2, as 802 comes before 2 in the previous list.]

802, 2, 24, 45, 66, 170, 75, 90

Sorting by most significant digit (100s place) gives:

2, 24, 45, 66, 75, 90, 170, 802

It is important to realize that each of the above steps requires just a single pass over the data, since each item can be placed in its correct bucket without having to be compared with other items.

Some LSD radix sort implementations allocate space for buckets by first counting the number of keys that belong in each bucket before moving keys into those buckets. The number of times that each digit occurs is stored in an array.
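
A compact LSD radix sort sketch in Python for non-negative integers (my own illustration; it reproduces the bucketing passes above):

    def radix_sort(a, base=10):
        digits = len(str(max(a))) if a else 0      # one pass per digit position
        for d in range(digits):                    # least significant digit first
            buckets = [[] for _ in range(base)]
            for x in a:
                buckets[(x // base ** d) % base].append(x)   # stable: preserves prior order
            a = [x for b in buckets for x in b]    # concatenate buckets in order
        return a

    # radix_sort([170, 45, 75, 90, 802, 24, 2, 66])
    # -> [2, 24, 45, 66, 75, 90, 170, 802]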

Link



Re: Algorithms

Post by Pigeon » Fri Nov 30, 2012 9:13 pm

A list of sorting algorithms at Wikipedia

Another page with sorting algorithms

List of algorithms

Dictionary of Algorithms and Data Structures at NIST
