There’s a bug in Tivoli Storage Manager (TSM): given a set of tapes to process in order to reclaim the space occupied by expired data, it fails to complete the task because it runs out of scratch tapes. Manual intervention is then required to finish the job. Let me describe the situation, and let you describe the bug. It’s an elementary problem.
You have two empty containers, and five partially full containers, as shown below. You must move the contents of the five partially full containers to the empty containers, leaving as many free containers as possible when you are done. Place the containers in the order in which you will move the contents.
TSM’s operations are coordinated by an internal database, for which there’s an SQL interface. So, let’s say you tell TSM to reclaim volumes where the percentage reclaimable is greater than 33%. It will issue a query something like this, and work with those volumes.
SELECT volume_name FROM volumes WHERE pct_reclaim > 33
That statement returns the following.
VOLUME_NAME
------------------
001004
001010
001077
001095
001121
001141
001146
001155
But let’s see what percentage reclaimable those volumes are.
SELECT volume_name,pct_reclaim FROM volumes WHERE pct_reclaim > 33
VOLUME_NAME        PCT_RECLAIM
------------------ -----------
001004                    33.2
001010                    55.6
001077                    33.2
001095                    33.3
001121                    43.2
001141                    35.6
001146                    36.7
001155                    74.8
Remember the shuffling exercise from earlier?
SELECT volume_name,pct_reclaim FROM volumes WHERE pct_reclaim > 33 ORDER BY pct_reclaim DESC
VOLUME_NAME        PCT_RECLAIM
------------------ -----------
001155                    74.8
001010                    55.6
001121                    43.2
001146                    36.7
001141                    35.6
001095                    33.3
001004                    33.2
001077                    33.2
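To make the shuffling exercise concrete, here is a rough Python sketch that plays it out with the numbers above. It is a toy model of my own, not TSM’s actual algorithm: the `reclaim` function is hypothetical, and it assumes every tape has the same capacity, that a volume’s live data is 100 minus its PCT_RECLAIM, and that a tape emptied during the run cannot be reused as scratch until the run is over. The two-scratch-tape budget is borrowed from the container puzzle.

```python
# Volumes and PCT_RECLAIM values from the query output above, in name order.
VOLUMES = [
    ("001004", 33.2), ("001010", 55.6), ("001077", 33.2), ("001095", 33.3),
    ("001121", 43.2), ("001141", 35.6), ("001146", 36.7), ("001155", 74.8),
]


def reclaim(volumes, scratch_tapes):
    """Return the volumes completely emptied before scratch space runs out.

    Assumptions (mine, not documented TSM behavior): all tapes have the
    same capacity, live data is (100 - PCT_RECLAIM) percent of a tape, and
    an emptied tape is not reusable as scratch during the same run.
    """
    free_space = scratch_tapes * 100.0  # measured in percent-of-a-tape units
    emptied = []
    for name, pct_reclaim in volumes:
        live = 100.0 - pct_reclaim
        if live > free_space:
            break  # out of scratch: this volume cannot be fully emptied
        free_space -= live
        emptied.append(name)
    return emptied


# Same volumes, same two scratch tapes, two different processing orders.
by_name = reclaim(VOLUMES, scratch_tapes=2)
by_reclaim = reclaim(
    sorted(VOLUMES, key=lambda v: v[1], reverse=True), scratch_tapes=2
)

print(len(by_name), by_name)
print(len(by_reclaim), by_reclaim)
```

Under those assumptions, working through the list in name order empties three volumes before the two scratch tapes are full, while working from most reclaimable to least empties four. Same tapes, same scratch budget, one more free tape at the end.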
I simplified a bit above. TSM attempts to optimize tape operations. The problem, however, is still how it sorts the list. Here’s an example from actual operations last night. Before space reclamation began, we had the following volumes available for reclamation.
VOLUME_NAME        PCT_RECLAIM
------------------ -----------
001001                    47.2
001093                    43.2
001078                    40.7
001198                    38.4
001163                    37.1
001067                    36.9
001016                    35.5
001004                    35.3
After space reclamation ran and used all four of the available scratch tapes, we were left with this.
VOLUME_NAME        PCT_RECLAIM
------------------ -----------
001078                    97.2
001001                    88.9
001198                    82.5
001016                    74.2
001093                    71.5
001163                    68.5
001004                    59.2
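If you want to see the damage at a glance, a few lines of Python will compare the two listings. This is only arithmetic on the numbers shown above; the variable names are mine, and it says nothing about what TSM did internally.

```python
# PCT_RECLAIM per volume before and after last night's reclamation run.
before = {"001001": 47.2, "001093": 43.2, "001078": 40.7, "001198": 38.4,
          "001163": 37.1, "001067": 36.9, "001016": 35.5, "001004": 35.3}
after = {"001078": 97.2, "001001": 88.9, "001198": 82.5, "001016": 74.2,
         "001093": 71.5, "001163": 68.5, "001004": 59.2}

# Volumes that no longer appear in the reclaimable list.
gone = sorted(set(before) - set(after))

# How much more reclaimable each remaining volume became.
change = {name: round(after[name] - before[name], 1) for name in after}

print("no longer listed:", gone)
for name, delta in sorted(change.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}  {before[name]:5.1f} -> {after[name]:5.1f}  (+{delta})")
```

Only 001067 dropped off the list. The other seven volumes each became more reclaimable, yet every one of them still holds data, and the four scratch tapes are spent.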