Reformatting text values:
This article describes how to use a simple I-type to format an existing D-type field with any number of multi-values containing text, into regularly sized values of text. While replicating data from UniVerse to Microsoft SQL Server I ran across some data that varied widely in its format with regards to the length of multi-valued text. The problem was evident when inserting the individual values as separate rows in the SQL Server database. The schema had been set to some arbitrary column width (for example VARCHAR(255)) which worked well for most of the records but for many the text length exceeded this limit and was truncated to fit the schema. One option would be to insert the contents of the field, including value-marks as a Binary Large OBject (BLOB) in its own column. This would make future queries difficult so a better option is to make the replicated data fit the schema.
Here is what the data looks like (I display some calculated values on-the-fly to help illustrate the problem):
Note: All programming displayed on this site is for illustrative purposes only. Use at your own risk.
>LIST EVENTS EVAL "DCOUNT(DESC,@VM)" COL.HDG "#vals" FMT "4R" EVAL "LEN(REUSE(DESC))" FMT "10R" COL.HDG "Len Desc" EVAL "CONVERT(@VM,'}',DESC)" COL.SPCS BY.DSND EVAL "LEN(REUSE(DESC))" COUNT.SUP
EVENTS #vals Len Desc CONVERT(@VM,'}',DESC)...................
130196 1 9998 While emailing from the tablet app the
full text was not completely visible at
all times. Instead I had to utilize the
vertical and horizontal scroll bars and
...
160283 18 1273 Choosing to print displays the print
dialog but the "Print" and "Cancel"
buttons aren't immediately visible. By
scrolling the app "up" the buttons are
visible.}Sometimes I can zoom in and
...
140119 17 792 Running the latest version the option to
automatically activate the map with my
current location isn't apparent.}After
changing views to map-view then by
clicking the use-current-location icon
will the map appear with my location.}
...
021092 2 66 There are periodic flashes that occur at
random times.}See attached.
...
181002 1 46 The font size of the field labels is too
small.
In the above example the list is sorted in descending order by the DESC field length. The largest field is 9,998 characters in length and the smallest field contains just 46 characters.
The problem that becomes apparent is that the replicated data, being split off into an associated table with foreign keys that link it to the base table row, is not formatted consistently. Obviously had the issue of varying lengths of values been anticipated then the entry window that initially captures the information could have been designed to always result in regularly sized values. With that said the most expedient solution is to replicate a new dictionary item in place of the "DESC" data field, one that "chunks" the output into regularly sized values that are replicated into individual rows. This will ensure all data conforms to the schema constraints and subsequent queries will return all matching rows.
The I-type used for this purpose is as follows:
>LIST DICT EVENTS "DESC" "DESCRIPTION"
Type &
Field......... Field. Field........................................................ Column......... Output Depth &
Name.......... Number Definition................................................... Heading........ Format Assoc..
DESC D 12 Description 40T M
DESCRIPTION I TRIM(CONVERT(@TM,@VM,FMT(TRIM(CONVERT(@VM,' ',DESC)),'80T'))) Description 80T M
2 records listed.
The original field is "DESC" which is a D-type item. The new field is "DESCRIPTION" which is an I-type item. The new item inserts value marks every 80th character. What follows are the same records shown above, but with the DESCRIPTION field alongside that, for the purposes of displaying what's happening, also converts the @VMs (value marks) to "}" just for the sake of visualizing them in the sample listing. The actual fields as rendered by RetrieVe appear as line breaks when listed.
>LIST EVENTS EVAL "CONVERT(@VM,'}',DESC)" EVAL "CONVERT(@VM,'}',DESCRIPTION)" COL.SPCS BY.DSND EVAL "LEN(REUSE(DESC))" COUNT.SUP
EVENTS CONVERT(@VM,'}',DESC)................... CONVERT(@VM,'}',DESC)...................
130196 While emailing from the tablet app the While emailing from the tablet app the
full text was not completely visible at full text was not completely visible at
all times. Instead I had to utilize the }all times. Instead I had to utilize the
vertical and horizontal scroll bars and vertical and horizontal scroll bars and}
...
160283 Choosing to print displays the print Choosing to print displays the print
dialog but the "Print" and "Cancel" dialog but the "Print" and "Cancel"
buttons aren't immediately visible. By }buttons aren't immediately visible. By
scrolling the app "up" the buttons are scrolling the app "up" the buttons are
visible.}Sometimes I can zoom in and }visible. Sometimes I can zoom in and
...
140119 Running the latest version the option to Running the latest version the option to
automatically activate the map with my automatically activate the map with my
current location isn't apparent.}After }current location isn't apparent. After
changing views to map-view then by changing views to map-view then by
clicking the use-current-location icon }clicking the use-current-location icon
will the map appear with my location.} will the map appear with my location.
...
021092 There are periodic flashes that occur There are periodic flashes that occur
at random times.}See attached. at random times. See attached.
...
181002 The font size of the field labels is too The font size of the field labels is too
small. small.
The result is that very large fields with none, one or a few value marks are broken up into discrete 80 character (or fewer) chunks. These segments will fit within a SQL Server table column with type VARCHAR(80). Any DESC fields that are not very large will result in a DESCRIPTION field that aggregates the values into one or a few value "chunks".
While purposefully entered line breaks (in the original field) will not always be preserved in the new aggregated field, the output does conform to a SQL Server schema without truncating anything.
No comments:
Post a Comment