Recently I have had the opportunity to delve into UniVerse semaphores. While troubleshooting unusual system hangs on one server, we found that when a program (including a subroutine or function) exits, any dynamic file variables (handles) that were opened and are not held in common cause UniVerse to lock a semaphore while it checks whether any groups need to split or merge. This behavior is not new and had never been a problem before. However, after upgrading from UniVerse 10.3 to 11.1.0 and above, we began to see the occasional system hang.
Working with Rocket Software's engineers, we isolated the problem to a utility subroutine that every user executes when logging into one of the main product accounts. The routine opens a dictionary (that happens to be dynamic) and reads a record to set a common variable for subsequent use by our products. Upon exiting the subroutine, UniVerse performs its housekeeping (locking the semaphore and checking whether a split or merge operation is called for). Apparently some kind of deadlock arose during this housekeeping, with multiple processes all contending for the same semaphore lock. Sometimes when this occurred, uvcleanupd would wake up after 30 minutes or so and clear the locks.
The formula used by UniVerse to calculate the semaphore number for file access is:
MOD((groupaddress + inode#), GSEMNUM) + 1
For large files (with a large modulo), many users accessing many different records (likely hashing to different groups) should rarely contend for the same semaphore. However, for a relatively small dynamic file with a small modulo, and with many processes accessing the same record ID, this formula may always return the same value. It was the latter scenario that this system (with over 600 users) was experiencing. The obvious solution (a workaround, really) was to resize the dictionary as a static hashed file instead of a dynamic file.
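To make the arithmetic concrete, here is a minimal Python sketch of the formula above. Only the formula itself comes from UniVerse; the GSEMNUM value, the inode number, and the idea that group addresses are simple multiples of a fixed group size are all assumptions I made up for illustration, not values taken from a real system.

```python
# Sketch of the semaphore formula quoted above:
#   MOD((groupaddress + inode#), GSEMNUM) + 1
# Every constant below is a hypothetical value chosen for illustration.

GSEMNUM = 97        # assumed number of group semaphores
INODE = 123456      # hypothetical inode number of the file
GROUP_SIZE = 2048   # hypothetical spacing between group addresses


def semaphore_number(group_address: int, inode: int, gsemnum: int) -> int:
    """Apply the formula to one group address."""
    return (group_address + inode) % gsemnum + 1


def semaphores_used(modulo: int) -> set:
    """Distinct semaphores reachable across all groups of a file.

    Assumes group g lives at address g * GROUP_SIZE, which is a
    simplification and not necessarily the real on-disk layout.
    """
    return {
        semaphore_number(g * GROUP_SIZE, INODE, GSEMNUM)
        for g in range(modulo)
    }


if __name__ == "__main__":
    # A large file spreads its groups over many semaphores.
    print("modulo 500 ->", len(semaphores_used(500)), "distinct semaphores")
    # A tiny file can only ever reach a handful.
    print("modulo 3   ->", len(semaphores_used(3)), "distinct semaphores")
    # Hundreds of processes reading the same record hit the same group,
    # so every one of them computes the same semaphore number.
    same_group = [semaphore_number(0, INODE, GSEMNUM) for _ in range(600)]
    print("600 readers of one record ->", len(set(same_group)), "semaphore")
```

The last case is the one that matches what we saw: the inputs to the formula never vary, so neither does the semaphore.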
While this type of lock contention can be avoided by changing the file type (an easy decision for a file dictionary that rarely changes), there is another way that users can be blocked from logging into UniVerse. It turns out that while the delivered uvlictool license tool (under UVHOME/bin) is running, no one can log in. If it is executed without piping to something that pauses the output, access is blocked only briefly, maybe one or two seconds at most. However, if the command is piped to more, pg, or anything else that can keep the command from completing, no new user will be able to get the uvsh (or uvdls) shell to start until it finishes! So the next time you are checking the current license count or monitoring workstations with multiple sessions (uvdls) and you pause the uvlictool output, be sure not to leave your desk for a break or leave work for the day. Also, if you have a script that runs uvlictool automatically, take care that it does not run too frequently, and make sure its output destination cannot end up pausing the uvlictool command.
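For scripted use, one way to guard against an accidental pause is to capture the output in memory and put a time limit on the command. The following is only a sketch in Python, not something shipped with UniVerse: the helper name run_unpaused and the 10-second default are my own choices, and the command line is passed in by the caller, so substitute whatever uvlictool invocation you already use.

```python
import subprocess
import sys


def run_unpaused(cmd, timeout_seconds=10.0):
    """Run cmd without a terminal or pager attached and return its stdout.

    Output is collected in memory, so nothing downstream can stall the
    command. If it runs longer than timeout_seconds, subprocess.run kills
    it and raises subprocess.TimeoutExpired.
    """
    result = subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,   # never wait on keyboard input
        capture_output=True,        # collect stdout/stderr in memory
        text=True,
        timeout=timeout_seconds,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in command for the demo; replace with your own license check.
    output = run_unpaused([sys.executable, "-c", "print('license report')"])
    # Page or log the captured text afterwards, once the command has exited.
    print(output, end="")
```

The point of the design is that any paging happens after the command has already exited, so reading the report at leisure no longer holds anyone up.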