Saturday, July 26, 2014

A program to perform a two-pass sort on an Inform/RetrieVe statement:


The last several months have been a very busy time for me (conferences, vacations, etc.) and I have not been able to post new content until now.

If you have worked with native MVDBMS queries for very long then the problem of generating columnar output that is sorted by a calculated on-the-fly column subtotal should be well known. The standard sorting mechanism will sort on a column (field) very nicely. However, adding TOTAL (or CALC, EVAL, etc.) and DET.SUP keywords to the query will result in immediate output that will have no way of knowing the "place" or order of the subtotal value within the entire list output. The reason for this is because the query engine simply selects the records by the designated sort criteria and then tallies the subtotals without regard to the actual subtotal value. This is handy when the list MUST be sorted by the field specified. But what if there is a need to display the information in SUBTOTAL order?

For example:

>SELECT TASKS WITH DATE GE "01/01/2014"

12729 record(s) selected to SELECT list #0.
>>LIST TASKS BY TYPE BREAK.ON TYPE TOTAL CNT DET.SUP
TYPE             #.......
                       12
100                    20
111                    64
113                    38
114                     1
118                     2
130                     3
131                    31
132                     4
140                     1
141                     1
142                     4
143                     1
150                     7
151                     8
154                     6
155                     2
200                     6
210                     1
...
745                   232
746                    22
800                     1
812                     2
900                    16
911                     4
                 ========
                    12729



This sort of list is typical of sorted lists. The output is sorted by the TYPE field, with no regard to the subtotal values.

Because of this behavior, generating the list in subtotal order requires a two-pass operation. In the past I have generated a list to the &HOLD& file and then downloaded that file to my PC, to then sort in a spreadsheet program like Microsoft Excel.

Recently I wrote a program called XSORT (which means eXtra SORT) to perform the query and display (or print) the output in subtotal order (or reverse subtotal order).

The program looks like this:

Note: All programming displayed on this site is for illustrative purposes only. Use at your own risk.



PROGRAM XSORT

*
* This program runs an Inform/RetrieVe statement and resorts the output
* in ascending or descending order (as long as the last column is numeric).
*
* Command line options:
*   REVERSE, REV or REVERSE-SORT : Sort the list in descending order
*   MINIMUM=nn or MIN=nn.nn      : Do not display values below a threshold

*

equ true  to 1
equ false to 0
equ subtot.fmt lit '"7R"'

sentence = @SENTENCE
sentence.original = sentence

convert " " to @FM in sentence
if sentence<1> = "RUN" then
   sentence = delete(sentence,1,0,0) ;* remove RUN
   sentence = delete(sentence,1,0,0) ;* remove object file name
end
sentence = delete(sentence,1,0,0) ;* remove verb

reverse.sort = false
minimum.threshold = 0
last.column.val = 0
load.seq = false
debugging = false
new.sentence = ""
loop
   remove token from sentence setting more
   begin case
      case not(len(token)) ;* do nothing
      case token = "REVERSE" or token = "REV" or token = "'REVERSE'1X'SORT'"
         reverse.sort = true
      case token matches "'MIN'0A'='1N0N0X"
         minimum.threshold = field(token,"=",2)
         if not(minimum.threshold >= 0 and minimum.threshold <= 9999999.99) then
            print "Syntax MINIMUM=nnnn (where nnnn is a real or integer number)"
            stop
         end
      case token = "DEBUG" or token = "DEBUGGING"
         debugging = true
      case true
         new.sentence<-1> = token
   end case
   until not(more)
repeat

if sentence <> new.sentence then
   sentence = new.sentence
end

convert @FM to " " in sentence

datetime.executed = oconv(date(),"D4/"):" ":oconv(time(),"MTS")
if debugging then print sentence
execute sentence capturing output

previous.vals  = ""
previous.lines = ""

found.first.line.of.data = false
found.first.empty.line = false
header = ""
next.to.last.row = ""
total.line = ""
all.other.L.col = 0
all.other.NTL.col = 0
NTL.col.is.numeric = 0
NTL.col.is.not.numeric = 0
num.lines = dcount(output,@FM)
loop
   remove line from output setting more
   if len(line) then
      last.column.val = field(trim(line)," ",dcount(trim(line)," "),1)
      next.to.last.column.val = field(trim(line)," ",dcount(trim(line)," ")-1,1)

      if last.column.val <> convert("=","",last.column.val) and (convert("=","",trim(last.column.val)) = "") then
         next.to.last.row = line
         remove total.line from output setting more
      end else
         if (last.column.val < 0 or last.column.val > 99999999) or next.to.last.column.val = "PAGE" then
            if not(found.first.line.of.data) then
               header<-1> = line
            end
         end else
            if found.first.empty.line then
               if not(found.first.line.of.data) then
                  found.first.line.of.data = true
               end
               if num(last.column.val) then
                  last.column.val= last.column.val+0
               end
               if last.column.val > minimum.threshold then
                  if reverse.sort then
                     locate last.column.val in previous.vals<1> by "DR" setting POS else null
                  end else locate last.column.val in previous.vals<1> by "AR" setting POS else null

                  previous.vals  = insert(previous.vals ,POS,0,0,last.column.val)
                  previous.lines = insert(previous.lines,POS,0,0,line)
                  previous.vals  = previous.vals
                  previous.lines = previous.lines
               end else
                  all.other.L.col += last.column.val
                  if num(next.to.last.column.val) then
                     all.other.NTL.col += next.to.last.column.val
                     NTL.col.is.numeric += 1
                  end else NTL.col.is.not.numeric += 1
               end
            end
         end
      end
   end else found.first.empty.line = true
   until not(more)
repeat

loop
   remove line from header setting more
   print line
   until not(more)
repeat
print
loop
   remove line from previous.lines setting more
   print line
   if not(more) then
      if all.other.L.col > 0 then
         offset = len(next.to.last.row)-len(trim(next.to.last.row))-14
         if NTL.col.is.not.numeric > NTL.col.is.numeric then
            *
            * The next to last column is not consistently numeric...
            *
            all.other.NTL.col = 0 ;* override the logic that displays NTL col
            offset += 2
         end
         pln = "All Other ":space(offset)
         if all.other.NTL.col > 0 then
            pln:= fmt(all.other.NTL.col,subtot.fmt):"   "
         end
         pln:= fmt(all.other.L.col,subtot.fmt)
         print pln
      end
      print next.to.last.row
      print total.line
   end
   until not(more)
repeat

end




Running the exact same query but with the new XSORT verb preceding the query (along with the "REVERSE" option) generates the identical values but now sorted in subtotal order:

>SELECT TASKS WITH DATE GE "01/01/2014"

12731 record(s) selected to SELECT list #0.
>>XSORT REVERSE LIST TASKS BY TYPE BREAK.ON TYPE TOTAL CNT DET.SUP
TYPE             #.......
321                  8468
300                   594
600                   536
320                   445
743                   317
745                   232
700                   180
322                   136
500                   133
740                   121
733                   121
510                   115
735                   107
744                    95
400                    76
522                    67
111                    64
520                    56
531                    49
730                    47
671                    38
...
442                     2
350                     2
331                     2
155                     2
118                     2
800                     1
712                     1
632                     1
512                     1
420                     1
356                     1
352                     1
351                     1
342                     1
341                     1
231                     1
213                     1
211                     1
210                     1
143                     1
141                     1
140                     1
114                     1
                 ========
                    12731




Note: The only requirement is that the last column (or last few columns) be numeric.

No comments:

Post a Comment