Multiset Combinations

Multiset Combinations: Formula-Based Solution

In Part 1 of this series, we presented a multiset and displayed all of the unique combinations on the console. We used an algorithm that worked like an odometer where the count of a given element served as the "turning point" to begin processing the next digit.

As promised, the code here presents a formula-based algorithm using "mod" and "div" to convert the "index" to a unique combination. The number of choices determines the counts represented by a given index value.

In part 3 of this series, we'll clean up and simplify the code even further.

Formula-Based Solution

The program below calculates the "unique combination" for a given index "i." The comments offer additional details.

# Unique Combinations - Multiset data
# 2022-06-18

set list_Items [list A B C D]
set list_Count [list 4 3 2 1]

# Calculate the total number of unique combinations
# Create a list containing the number of CHOICES per Item
# Create a list to manage the unique combination
# Note:  lappend creates a list if it does not exist.

# Version 2 (2022-06-18)
set choices 1
# Calculate the sum of all counts to be used for the field width of SUM
set sum_ALL 0

for {set i 0} {$i < [llength $list_Count]} {incr i} {
    # Get the count of items from list_Count, add 1 to the value and
    # append "number of choices" to list_Choices
    lappend list_Choices [expr { [lindex $list_Count $i] + 1}]
    #
    lappend list_Index 0
    # Calculate new value of choices
    set choices [expr { $choices * [lindex $list_Choices $i] }]
    set sum_ALL [expr { $sum_ALL + [lindex $list_Count $i] }]
}

puts "choices = $choices"
puts "list_Choices = $list_Choices"

#------------------------------------------------------------------#
# We use "i" to serve as our counter and convert it to a unique
# combination.
# Version 2 (2022-06-15)
#
# We'll print a "prettier" table this time.
set dash "-"
    puts -nonewline [format "%*s:  " [string length $choices] \
                                      [string repeat $dash [string length $choices]]]
    puts -nonewline [format "  %*s " [expr {[llength $list_Index]*2 - 1} ] \
                       [string repeat $dash [expr {[llength $list_Index]*2 - 1}]]]
    for {set m 0} { $m < [llength $list_Index] } { incr m } {
        puts -nonewline [format "  %-*s" [lindex $list_Count $m] \
                                        [string repeat $dash [lindex $list_Count $m]]]
    }
    puts -nonewline [format " | %*s:  " [string length $sum_ALL] \
                                         [string repeat $dash [string length $sum_ALL]]]
    puts [format "%*s | " $sum_ALL [string repeat $dash $sum_ALL]]

for {set i 0} {$i < $choices} {incr i} {
    # The first data set combination is ALL ZEROES

    # Convert the variable "i" to a unique combination.
    # Get the data set combination
    # The first element of the solution is the "mod" of the index "i."
    # list_Choices contains each item's number of choices (Count of elements + 1).

    # We use the "mod" and "div" commands to convert "i" into a combination
    # derived from the available choices we have for each letter.
    # The process is similar to discerning the digits in our base ten numbering
    # system to isolate the 1's, 10's, 100's, 1000's and so on.

    #---------< This is the core "engine" that is doing the work >---------#

    set div_Step $i
    # mod to get the "first" count
    lset list_Index 0 [expr {$i % [expr {[lindex $list_Count 0] + 1}]}]

    for {set j 1; set k 0} {$j < [llength $list_Items]} {incr k; incr j} {
        # div div_Step by the previous number of choices
        set div_Step       [expr {$div_Step / [lindex $list_Choices $k]}]
        # mod div_Step by the current number of choices
        lset list_Index $j [expr {$div_Step % [lindex $list_Choices $j]}]
    }

    #----------------------< Process the DATA Set  >-----------------------#
    # list_Index contains the unique combination data set.
    # Display the data set combination - Default is the console.
    # We could also create a file to save the information
    # format formatString ?arg arg ...?

    puts -nonewline "[format "%0*d:  %s" [string length $choices] $i  $list_Index]"
    for {set m 0} { $m < [llength $list_Index] } { incr m } {
        puts -nonewline "  [format %*s [lindex $list_Count $m] [string repeat [lindex $list_Items $m] [lindex $list_Index $m]]]"
    }

    # Process the data set
    # Calculate the sum of the counts
    set sum 0
    for {set m 0} { $m < [llength $list_Index] } { incr m } {
        set sum [expr {$sum + [lindex $list_Index $m]}]
    }

    puts "[format " | %*d:  %*s |" [string length $sum_ALL] $sum $sum_ALL [string repeat $dash $sum]]"
    #       ...
    #       ...
    # Process ends

    # Loop to Get the NEXT data set combination
}

puts "Program Terminated Normally!"

Code Output

choices = 120
list_Choices = 5 4 3 2
---:  -------  ----  ---  --  - | --:  ---------- | 
000:  0 0 0 0                   |  0:             |
001:  1 0 0 0     A             |  1:           - |
002:  2 0 0 0    AA             |  2:          -- |
003:  3 0 0 0   AAA             |  3:         --- |
004:  4 0 0 0  AAAA             |  4:        ---- |
005:  0 1 0 0          B        |  1:           - |
006:  1 1 0 0     A    B        |  2:          -- |
007:  2 1 0 0    AA    B        |  3:         --- |
008:  3 1 0 0   AAA    B        |  4:        ---- |
 .
 .
 .
114:  4 2 2 1  AAAA   BB  CC  D |  9:   --------- |
115:  0 3 2 1        BBB  CC  D |  6:      ------ |
116:  1 3 2 1     A  BBB  CC  D |  7:     ------- |
117:  2 3 2 1    AA  BBB  CC  D |  8:    -------- |
118:  3 3 2 1   AAA  BBB  CC  D |  9:   --------- |
119:  4 3 2 1  AAAA  BBB  CC  D | 10:  ---------- |

Program Terminated Normally!

Explanation

As noted in the comments, the core code is a relatively short series of tcl commands:

    #---------< This is the core "engine" that is doing the work >---------#

    set div_Step $i
    # mod to get the "first" count
    lset list_Index 0 [expr {$i % [expr {[lindex $list_Count 0] + 1}]}]

    for {set j 1; set k 0} {$j < [llength $list_Items]} {incr k; incr j} {
        # div div_Step by the previous number of choices
        set div_Step       [expr {$div_Step / [lindex $list_Choices $k]}]
        # mod div_Step by the current number of choices
        lset list_Index $j [expr {$div_Step % [lindex $list_Choices $j]}]
    }

Note that we are performing a "mod" operation before entering the "for" loop, as we only need to know whether the first "count" in our unique combination is a factor of the number of choices for the first item.

We then initialize two counters, "j" and "k," to the values 1 and 0, respectively. We divide the index "i" by the number of choices for the first item and perform a "mod" operation using the number of choices for the following item in the list.

We can simplify the calculation further by using a single counter and reversing the "mod" and "div" steps as we'll see in part 3 of this series.

For the purpose of this example, we subject the unique combination to a simple process to display the result: 114: 4 2 2 1 AAAA BB CC D | 9: --------- |. The "process" lists the index, unique combination counts, the items (letters in this case), and the sum of the counts followed by an equivalent number of dashes.

The format statements are not as complicated as they look. The syntax for the format command is: format FormatString ?arg1 arg2 ... where FormatString contains the text and field specifiers for arg1, arg2 and so on. Refer to the Tcl format documentation for more details on the command syntax.

[format " | %*d: %*s |" field_width_d value_d field_width_s value_s]

A preceding "*" defers the field specifiers to the "args" following the format string.

Next Steps

Using our formula, we can derive any unique combination or count sequence for a given index value independent of any previous calculations or steps. in turn, we should now have the ability to derive a formula to calculate the index for a given unique combination.

We can also develop a separate procedure to return a unique combination for a given index such that we can focus on processes that require the unique combination.

See you in Part 3 ...

Multiset Combinations - Part 2