::::::::::::::
../usgs/info
::::::::::::::

USGS COLORADO GNIS DATA -- MISCELLANEOUS INFORMATION
by Alan Silverstein
last edit 970711

In June, 1985 I bought from the United States Geological Survey (USGS),
for $75, a magtape containing a copy of the Geographic Names Information
System (GNIS) Alphabetical Finding List for Colorado, in labelled ASCII
format.  This document is a collection of miscellaneous information
relating to that tape, including:

    Unpacking Tape
    Usage and Format
    Converted Format
    Summary of Feature Classes
    Summary of Elevations
    Data Checks Performed
    Defects Noted but Not Corrected
    Defects Noted and Corrected
    Enhancements Made

To manage this data I wrote some tools under HPUX, Hewlett-Packard's
version of the AT&T UNIX(tm) operating system:

    conv.c        convert raw data to manageable form
    convcheck.c   check converted data as completely as possible
                  (described below)
    format.c      format converted data for printing nicely
    extract       extract fields of interest from converted data for
                  use with gcdist program
    pops          print/plot population numbers (after I added them,
                  see below)
    locplot.c     plot data using HP graphics system
    quad          convert lat/long to 7.5 minute quadrangle
    clip          select points within a rectangular area
    plot.quad     plot one quadrangle's data

I also created some additional files (documents):

    info          this document
    feat.classes  FEATURE CLASS field values and definitions
    fips          Federal Information Processing Standards numeric
                  codes for Colorado counties
    Col.border    Colorado boundary points, taken from Colorado
                  Mapology by Erl Ellis (actually buried in the
                  "fourteeners" document elsewhere)


USGS COLORADO GNIS DATA: UNPACKING TAPE (on HPUX)

    # Had repeated trouble on one tape drive after record 731;
    # switched to another tape drive.

    $ dd < /dev/rmt > label         # read label (first file).
    0+3 blocks in
    0+1 blocks out

    $ cat label         # broke into lines, removed trailing blanks.
    VOL1ABQ981 PAYNE 1
    HDR1COLOGAZ ABQ98100010001 85170 00000 000000OS370
    HDR2F039900013330VG5061WW/QUESTRAN B 00

    # 30 data records per tape block, 133 bytes each, with no newlines
    # or other separators.

    $ dd < /dev/rmt > raw ibs=3990 obs=39900    # whatever obs desired.
    979+1 blocks in
    97+1 blocks out

    $ ll raw
    -rw-rw-rw- 1 ajs adm 3906742 Jun 27 17:49 raw

    $ conv -v < raw | tail -2       # special conversion program.
    Lines read: 29374
    Short lines: 0

    $ vitals raw                    # vital statistics.
    7824 38717 0 215472 3906742 raw

    $ dd < /dev/rmt > eof           # nothing more on tape after this.
    0+2 blocks in
    0+1 blocks out

    $ cat eof           # broke into lines; removed trailing blanks.
    EOF1COLOGAZ ABQ98100010001 85170 00000 000980OS370
    EOF2F039900013330VG5061WW/QUESTRAN B 00


USGS COLORADO GNIS DATA: USAGE AND FORMAT

All data on this tape is Phase I (see accompanying paper).  There are
no limitations on use of it; it's all public domain.  The $75 pays for
the cost of the tape, duplication, and mailing.  "Once you have it we
don't care what you do with it."

The following raw format information includes column numbers and title
lines.  It also shows which data columns are always blank ('_'), mixed
('x'), and never blank ('X'), across all data.

    0000000001111111111222222222233333333334444444444555555555566666666667
    1234567890123456789012345678901234567890123456789012345678901234567890
                                                    FEATURE   STATE
     NAME                                           CLASS     COUNTY
     A Diamond J Ranch                              locale    08107
     A H Lateral                                    canal     08085
     A L Gulch                                      valley    08093
    _Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx_XXXxxxxx__XXXXX_xxxxx_

    777777777888888888899999999990000000000111111111122222222223333
    123456789012345678901234567890123456789012345678901234567890123
                         ELEV
    COORDINATE      BGN  FEET  SOURCE          MAP
    405635N1070838W      7206                  0016
    382443N1074733W                            1151
    391438N1053821W            391613N1053755W 0769 0826
    XXXXXXXXxXXXXXX_xxxx_xxxxx_xxxxxxxxxxxxxxx_XXXXxxxxxxxxxxxxxxxx

      2- 47  NAME           Official geographic name.

     49- 56  FEATURE CLASS  See Feature Class Definitions list (entered
                            from paper list).
     59- 63  STATE/COUNTY   Primary county in which feature is located;
                            see FIPS codes for Colorado (entered from
                            paper list).

     65- 69                 Secondary state/county, if any.

     71- 85  COORDINATE     Latitude, longitude (ddmmssNdddmmssW).

     87- 90  BGN            Board on Geographic Names "date of decision
                            on contested named feature", if any.

     92- 96  ELEV FEET      Taken from map; "must be on the map and near
                            the feature".  Left-justified.

     98-112  SOURCE         Source coordinate, if any, of linear feature.

    114-132  MAP            One to four map numbers.  Don't have the
                            full list except on microfiche.


USGS COLORADO GNIS DATA: CONVERTED FORMAT

By running conv -e to break records into HPUX text lines and line up
fields on tab boundaries, then removing trailing spaces and running
through unexpand -a, the size of the file was dropped from 3906742
bytes to 1943084 bytes without loss of data.  It could be reduced by
172088 more bytes if all multiple tabs were compressed to single tabs.

The mapping used was:

      input   output   field
      2- 47    1- 46   NAME
     49- 56   49- 56   FEATURE CLASS (changed to population for some ppls)
     59- 69   65- 75   STATE/COUNTY
     71- 85   81- 95   COORDINATE
     87- 90   97-100   BGN
     92- 96  105-109   ELEV FEET
     98-112  113-127   SOURCE
    114-132  129-147   MAP

Vitals(1) for converted data:

    d0f9 21355 29374 215418 1943084 converted

When compressed with compress3.0, the size was 648143 bytes, a 66.6%
reduction.

Vitals on data after error corrections made, latest on 890427
(described later):

    0bc1 56675 29378 215691 1944576 converted


USGS COLORADO GNIS DATA: SUMMARY OF FEATURE CLASSES

These numbers are after error corrections (described later) were made
to the data, but before population figures were added.
65 total known Feature Classes:

    49 in data base and in GNIS Feature Class list
     3 in data base but not in list (marked with "*")
    13 in list but not in data base (count of zero)

Occurrences in data base:

     5347  stream            23  bend
     4580  valley            19  bench
     3114  summit            19  forest
     3004  locale            14  glacier
     1897  tank *            12  cape
     1775  canal             12  cave
     1354  lake              12  island
     1263  mine              12  swamp
      969  spring             9  civil
      937  school             6  channel
      780  ppl                4  arch
      756  flat               3  gut
      540  trail              2  hill *
      482  cem                2  reserve
      341  ridge              1  beach
      315  gap                1  building
      286  basin              1  crater
      239  park               1  military *
      206  cliff              1  rapids
      188  other              0  arroyo
      150  pillar             0  bar
       97  church             0  geyser
       91  airport            0  harbor
       74  dam                0  isthmus
       74  falls              0  lava
       70  range              0  levee
       67  well               0  plain
       57  oilfield           0  reservoir
       47  tunnel             0  sea
       40  hosp               0  slope
       29  area               0  tower
       26  bridge             0  woods
       25  bay

    29374  total


USGS COLORADO GNIS DATA: SUMMARY OF ELEVATIONS

These numbers are after error corrections (described later) were made
to the data, but before elevation figures were added from the Colorado
state map.

    24253  blank
        2  clearly in error (see below); treated as blank
     5119  nonblank (17.4%) (including three correctable errors)
    29374

    KF    CNT    PCT     (KF = kilofeet)
     3    126    2.5
     4    452    8.8
     5    465    9.1     (mainly cities, etc?)
     6    341    6.7
     7    512   10.0
     8    694   13.6     (mainly summits?)
     9    622   12.2
    10    526   10.3
    11    466    9.1
    12    492    9.6
    13    365    7.1
    14     56    1.1
         5117  100.1


USGS COLORADO GNIS DATA: DATA CHECKS PERFORMED

Many errors I stumbled over while doing other things with the data.
Most were found using the convcheck program, which checked the
following items in the converted data.

1.  Line length not too short or long (no errors; original data was
    healthy too).  (To recheck, run the data through expand first.)

2.  Field separators all blank (some errors found inside the MAP
    field).

3.  NAME fields:

    - don't start with blank (no errors after conversion);
    - no multiple embedded blanks between words (no errors);
    - no word starts lower-case;
    - no two uppercase letters in a row.

    No other checks were performed!

4.  FEATURE CLASS fields all in Appendix B list.

5.  STATE/COUNTY fields all odd numbers between 08001-08123,
    inclusive, as defined in FIPS codes.

6.  COORDINATE and SOURCE fields:

    - non-null COORDINATES (no errors found);
    - valid location in the format "ddmmssNdddmmssW";
    - not outside state boundaries (as described later).

    No other checks were performed!  And I found, by accident, at
    least one location which was in error by over a mile.

7.  BGN fields all null or valid numbers in the range 1880-1981,
    inclusive (no errors found).  (Now Challenger Point, 1987, is
    shown as an error.)

8.  ELEV fields all null or valid numbers in the range 3350-14433,
    inclusive (no errors found).

9.  MAP fields all valid numbers 0000-9999.  Did NOT check numbers
    against the (very long) list of valid map codes.


USGS COLORADO GNIS DATA: DEFECTS NOTED BUT NOT CORRECTED

[After the rest of this document was written, I exchanged mail with
the head of the GNIS project.  His comments are inserted in square
brackets where relevant.]

1.  "Tape Layout" page shipped with the tape is wrong.  It appears to
    be very old information.

    ["We are puzzled as to why you did not receive a tape layout sheet
    or why your tape did not include files of feature classes and
    their definitions..."]

2.  NAME field:

    Data seems to have been sorted case-independently under EBCDIC,
    not ASCII, collating sequence, so some lines containing digits are
    "out of ASCII order".

    40 entries have "*" in the field, apparently signifying something
    special, but I don't know what.

    ["Temporary means of indicating that there is a diacritical mark
    somewhere within the name which will be added during a later
    editing stage."]

    "Mc Kinely School" probably should be "Mc Kinley".  (Looks like
    there might be a LOT of silly typographical errors.)

    ["The data were manually keyed from the USGS topographic maps
    *except* for the geographic coordinates."]

    These MIGHT be errors; I'm not sure:

        Bimettalist Mine                (Bimetalist?)
        Breckinridge Peak               (Breckenridge?)
        Castastrophe Mine               (Catastrophe?)
        Jamuary Reservoir               (January?)
        Limekin Gulch                   (Limekiln?)
        Mapolean Pass                   (Napoleon?)
        Middle Fork Puratoire River     (Purgatoire?)
        Racoon Knob                     (Raccoon?)
        Rasberry Creek                  (Raspberry?)
        San Juan Bautisa Cemetery       (Bautista?)
        Sentinal Mountain               (Sentinel?)
        Vermilion Peak                  (Vermillion?)
        Williw Creek                    (Willow?)

    Punctuation is inconsistent.  In particular some initials are
    followed by periods (e.g. "J. C. Johnson") and others are not.

3.  FEATURE CLASS field:

    "cemetery" is abbreviated as "cem" in 482 entries; "hospital" is
    abbreviated as "hosp" in 40 entries.  These appear to be official
    abbreviations, but differ from listings in Appendix B, FEATURE
    CLASS Definitions.

    1897 entries have a FEATURE CLASS of "tank", 2 have "hill", and 1
    has "military", none of which are mentioned in the list of 62
    FEATURE CLASS definitions.  Appendix B lists "reservoir" with a
    pseudonym of "tank", but in fact "tank" is used, not the official
    class name.

4.  STATE/COUNTY fields:

    First STATE/COUNTY field:

        115 are undocumented code 08125
            [Yuma County; was left out of the USGS User's Guide by
            accident.]

         16 are invalid Colorado (08) codes, 11 different values:
            08016 08018 08042 08056 08066 08072 08088 0808K 08104
            08106 08707

         33 are non-Colorado codes, 14 different values:
            04045 05051 06031 09097 09109 20043 20119 49001 49009
            49019 49047 56007 56021 56037

    Second STATE/COUNTY field:

          4 are undocumented code 08125

        342 are non-Colorado codes, 29 different values:
            20023 20043 20071 20075 20181 20187 20199 31029 31033
            31049 31057 31105 31135 35007 35039 35045 35055 35059
            35129 40025 49009 49019 49037 49047 49049 56001 56007
            56021 56037

    Summary of STATE codes on invalid values in both fields:

        CNT  ST  Name
        135  08  Colorado
        133  35  New Mexico     (these border on Colorado)
         96  49  Utah
         82  56  Wyoming
         25  20  Kansas
         20  40  Oklahoma
         14  31  Nebraska
          2  09  Connecticut    (huh? these don't)
          1  04  Arizona
          1  05  Arkansas
          1  06  California

5.  COORDINATE field:

    69 entries have "*PRIMARY COORD*" instead of a valid value.
    ["...was used to indicate that the mouth of a linear feature or
    the center of an areal feature is outside the bounds of the state
    in question..."]

    The following locations are more than 10 minutes of arc outside
    the approximate Colorado state boundaries (37-41N, 102-109W), and
    are probably erroneous:

    Latitude:

        272438N     Elwood Creek
        303358N     Hooker Mountain
        323156N     Mineota Ditch
        355532N     Little Silver Basin
        360003N     Mancos River
        360617N     Campo Cemetery
        364649N     Pump Canyon
        364807N     Navajo Reservoir
        364915N     Los Pinos River
        415918N     Rhea Ranch
        423007N     Road Spring
        430136N     Yampa River
        454645N     Johnson Ditch
        472910N     Bassett Cemetery
        485203N     Stratton School

    Longitude:

        0980619W    Morris Reservoir
        0981922W    Cheney Reservoir
        0983455W    Hells Hole
        0983813W    Thrailkill Spring
        1091007W    Too High Mine
        1091045W    Ryan Creek
        1091110W    Renegade Creek
        1091311W    Uranium Girl Mine
        1091350W    Coates Creek
        1091439W    Masters Post Office
        1092308W    Lion Canyon
        1094043W    White River
        1095049W    Willow Creek
        1095305W    Matheson Hill
        1095748W    Mc Kenzie Canyon
        1095846W    Spring Creek

    Idaho Springs (ppl) location appears to be grossly wrong (way too
    far east).

6.  SOURCE field:

    The following locations are more than 10 minutes of arc outside
    the approximate Colorado state boundaries (37-41N, 102-109W), and
    are probably erroneous:

    Latitude:

        322721N     Berry Gulch
        361312N     Johnny Draw
        411228N     Dale Creek
        422709N     Johnson Canyon
        424348N     Craig Draw

    Longitude:

        1091017W    Stewart Gulch
        1091443W    La Sal Creek
        1091808W    Enoch Gulch
        1095831W    McKenzie Canyon
        1095952W    Coal Bed Canyon
        1095954W    Box Canyon
        1105400W    San Juan River
        1180613W    Negro Creek

7.  NEW DEFECTS noted since 890427 but not yet addressed:

    Sequndo -> Segundo?
    Duplicate entries in subset files, at least in one case arising
    from duplicate original data in converted.Z:

        stream  401858N 1072239W  Second Creek
        summit  391507N 1062601W  Bald Eagle Mountain (11913)
                400823N 1071958W  Pagoda Peak (11120)
                404357N 1085710W  Diamond Mountain (8542)
        valley  381228N 1090329W  Island Canyon

    Duplicate due to typo:

        other   395700N 1080640W  Little Hils Game Experiment Station

    Others:

        summit  Beartrap Gulch is listed as a summit but it's not.
        locale  Sleping Elephant Campground (7837)
        lake    403130N 1053143W  Latman Lake (not at this lat/long)

    Locations of Bear Lake (9475), Tyndall Glacier, and possibly
    Andrews Pass in RMNP are all off by 1-2 miles.  (Fixed Bear Lake
    in subset/lake only, 9707 with info from Mike Molloy.)

    Should be 4006?

        400512N 1054356W  Adams, Mount (12121)

    Three summits (Evans, Stratus, Yuma) are listed as "Mount ..."
    instead of "..., Mount".


USGS COLORADO GNIS DATA: DEFECTS NOTED AND CORRECTED

I manually corrected the errors listed below in my copy of the data.
This serves as a record of changes made.

1.  NAME field:

    Bad characters:

        East Mancos %river          changed "%r" to "R"
        Second %creek               changed "%c" to "C"
        Garfield @creek             changed "@c" to "C"
        Holy Cross @ridge           changed "@r" to "R"
        Lorencito Canyo^Z           changed control-Z to "n"

    Wrong-case:

        carr                        uppercased "c"
        HAIRPIN DITCH               corrected
        iSLAND cANYON               reversed case
        J. c. Johnson Tunnel        uppercased "c"
        lAKE fORK mINNESOTA cREEK
        North abeyta Creek          uppercased "a"

    (Note: 11 other entries have NAME fields which legitimately start
    with a lower-case letter, e.g. "of the Clouds, Lake").

    Incorrect names:

        Andersoville                        -> Andersonville
        Antonio (ppl)                       -> Antonito
        Bbernardino Creek                   -> Bernardino
        Brckenridge Ski Area                -> Breckenridge
        Capitol Lkae                        -> Lake
        Connundrum Hot Springs              -> Conundrum
        Cureranti Pass Trail                -> Curecanti
        East River No 2 Dich                -> Ditch
        Farista (ppl)                       -> Farisita
        Flat Collins...                     -> Fort (two cases)
        Gllcier Falls                       -> Glacier
        Grand Vrew Mesa                     -> View
        Hale Diich Aqueduct                 -> Ditch
        Highline Dich                       -> Ditch
        Johstown Reservoir                  -> Johnstown
        Josephine Lkae                      -> Lake
        Laporte Cemetary                    -> Cemetery
        Larkspur Citch                      -> Ditch
        Lee Lkae                            -> Lake
        Lonesome Lkae                       -> Lake
        Middllton Creek                     -> Middleton
        Neilson Gucch                       -> Gulch
        No Name Ccreek                      -> Creek
        Norrrie Guard Station               -> Norrie
        North Fork North Creston Creek      -> Crestone
        Rattllsnake Mesa                    -> Rattlesnake
        Saint Mary Magalene School          -> Magdalene
        West Creek (ppl)                    -> Westcreek

    Many NAME fields have a blank after "Mc" or "Mac".  Removed the
    blanks.  (Not all "Mc" or "Mac" entries had the extraneous blank;
    it was inconsistent.)

          5 NAME started with "Mac "
        193 NAME started with "Mc "
          8 NAME contained "Mc "

    Duplicate entries:

        Loveland Airport    removed second one, with slightly
                            different location and no elevation

2.  FEATURE CLASS field:

    "bldg" and "miliiary" appear on one entry each.  Those are invalid
    feature classes (typos).

                                            OLD         NEW
        Buckley Air National Guard Base     miliiary    military
        Erie Filtration Plant               bldg        building

    "lakes" and "springs" are used as FEATURE CLASS on two entries
    each.  Not supposed to be plural, according to Appendix B.

        Church Lakes        lakes
        Twin Lakes          lakes
        Bluff Springs       springs
        Willow Springs      springs

3.  COORDINATE field:

        Commerce City       394880N 1045600W    "80" -> "30"
        Chapman Gulch       391557N 4063800W    -> 106...
        Link Creek          404657N 4055317W    -> 105...
        Ridgeway            380910N 1074590W    -> 1074530

    On these, changed the last digit to "5":

    ["The letters I,J,K, and O occurring in the MAP (sic) field have
    no meaning and are errors."]

        Arch Rocks          402100N 105385KW
        Barnesville         402845N 104284KW
        Bear Mountain       393722N 105172IW
        Cahone School       374001N 108482KW
        Discovery Tunnel    393056N 106331KW
        Gwendolen Lake      395820N 107174IW
        Keystone Mountain   393411N 105552JW
        New York Lake       392939N 106361IW
        Paymaster Mine      393720N 105475KW
        Pennock Pass        403438N 105304JW
        Puett Reservoir     373306N 108175IW
        Quaitie Spring      372145N 107304JW
        Rangely             400516N 108481IW
        Rowe Glacier        402918N 105384OW
        Walls Gulch         372459N 107582OW
        Woods               403517N 104575KW

    The longitude is wrong for Mount Columbia; it is 1061849W, but
    should be 1061749W.  (Looks like a mistyped digit; wonder how
    often that might happen?  I thought all the data was directly
    digitized?)  [It was, so it's even more puzzling.]

    Tabeguache Peak location and elevation were apparently taken from
    the wrong summit.  Corrected them to measured location and map
    elevation:

        old:    383736N 1061536W    13908
        new:    383732N 1061501W    14155

4.  SOURCE field:

        Big Spring Draw     108402IW    changed "I" to "5"
        Cottonwood Canyon   357304N     removed SOURCE field
        Indian Creek        1084297W    removed SOURCE field
        Open Draw           383974N     removed SOURCE field
        Ruedi Creek         106491/W    changed "/" to "5"

5.  ELEV FEET field:

        East Delaney Lake           08113   removed leading 0
        Bald Hill                   97170   -> blank
        Grover                      507     -> 5071 (ref: Colorado map)
        Red Elephant Hill           1316    -> blank
        Steamboat Springs Airport   68799   -> 6879 (see Steamboat)

    Elevations for these Fourteeners were probably to benchmarks, in
    every case lower than the values in the USGS pamphlet "Elevations
    and Distances in the United States".
    Changed the elevations to reflect best known values from that
    source:

                                    OLD     NEW
        Belford, Mount              14196   14197
        Columbia, Mount             14071   14073
        Eolus, Mount                14083   14084
        Harvard, Mount              14414   14420
        Huron Peak                  14003   14005
        of the Holy Cross, Mount    14003   14005
        Pikes Peak                  14009   14110
        Shavano Peak                14225   14229
        Windom Peak                 14082   14087
        Yale, Mount                 14194   14196

6.  MAP field:

    151 (?) entries have missing blanks between number fields (they
    run together).  Added missing blanks, then found three more
    entries with obvious errors, which were removed:

        Deadmans Gulch          542         missing digit
        Rock School             4 15 17     ?
        South Platte River      0,O0        typo?

    (Note: Checked that all remaining MAP fields are exactly four
    digits, but did not compare them with valid numbers.)


USGS COLORADO GNIS DATA: ENHANCEMENTS MADE

1.  Added missing Fourteeners and major sub-peaks based on measuring
    7.5 minute maps.  Copied best available data for STATE/COUNTY and
    MAP fields.  Elbert and Massive sub-peaks are not officially-named
    places.  Challenger Point location is from USGS, 8705; height may
    be 40' higher (never been surveyed).

        North Massive       14320   391144N 1062913W
        Sneffels, Mount     14150   380013N 1074730W
        South Elbert        14134   390615N 1062625W
        South Massive       14132   391045N 1062804W
        Challenger Point    14080   375849N 1053622W

2.  Converted FEATURE CLASS "ppl" to a 1980 census (March, 1981
    Advance Report) population figure for 258 incorporated areas
    listed with population figures in the index of the Colorado
    official state map.  Also added elevations for all cities listed
    in the map, where not already present in the GNIS, but did not
    change existing ones despite some large differences (as much as
    400') between the index and GNIS values.  Vancorum elevation was
    added to East Vancorum and West Vancorum.

    By virtue of added population figures, converted these "locale"s
    to "ppl"s:

        Avon
        Kit Carson
        Winter Park

    These 23 places appear in the Colorado map index but not in the
    database (even with NAMEs permuted in various ways).
    Populations (if known) and elevations are also shown.

        Arapahoe                    .    4050
        Cheney Center               .    3570
        Copper Mountain             .    9680
        Deer Ridge                  .    8930
        Dowd                        .    7750
        Echo Lake                   .   10650
        Firstview                   .    4580
        Fort Lewis                  .    7610
        Heeney                      .    7870
        Karval                      .    5070
        Lochbuie                  895    4980
        Log Lane Village          709    4330
        Mosca                       .    7550
        Mt. Crested Butte    Est. 272    8945
        North Pole                  .    7600
        Parachute                 338    5095
        Pueblo West                 .    4960
        Royal Gorge                 .    6400
        San Francisco               .    9200
        Segundo                     .    6480
        Slater                      .    6540
        Snowmass Village          999    8575
        The Forks                   .    5880

    Summary after changes made:

        258 ppl's with population:      2,116,380
        525 without known population
        783 total ppl's

          5 missing with population:        3,213
            total known population:     2,119,593  (73.4%)
            total other population:       769,241  (26.6%)
            total state population:     2,888,834  (also from
                                                   Colorado map)

3.  Added temperatures (degrees Fahrenheit) after names of 12 hot
    springs in the form: with a blank before the "<".  Only added
    them to entries of type "spring" where the name and location
    matched.  Did not enter temperatures for 35 other springs which
    were not found, or which were not of type "spring" (usually they
    were "locale" or "ppl" instead).  Note that "<" and ">" did not
    otherwise appear in the data.  Temperatures are from "Great Hot
    Springs of the West" by Bill Kaysing, 1984.
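

ILLUSTRATION: COORDINATE CHECK SKETCH

The COORDINATE and SOURCE checks described under DATA CHECKS PERFORMED
(item 6) can be sketched in a few lines of Python.  This is only an
illustration, not the convcheck.c code: the function names, the
minutes/seconds range test, and the three result strings are all
assumptions.  The state rectangle (37-41N, 102-109W) and the 10-minute
tolerance are the approximate values quoted in the defect lists.

```python
import re

# Approximate Colorado boundaries quoted in this document (37-41N,
# 102-109W), in arc-seconds, plus the 10-minute tolerance used in the
# defect lists.  All names below are hypothetical.
MARGIN = 10 * 60
LAT_RANGE = (37 * 3600 - MARGIN, 41 * 3600 + MARGIN)
LON_RANGE = (102 * 3600 - MARGIN, 109 * 3600 + MARGIN)

# Raw field layout: ddmmssNdddmmssW (15 characters, no embedded blank).
COORD_RE = re.compile(r"([0-9]{2})([0-9]{2})([0-9]{2})N"
                      r"([0-9]{3})([0-9]{2})([0-9]{2})W")


def parse_coord(field):
    """Return (lat, lon) in arc-seconds, or None if the field is malformed."""
    match = COORD_RE.fullmatch(field)
    if match is None:
        return None
    lat_d, lat_m, lat_s, lon_d, lon_m, lon_s = map(int, match.groups())
    if max(lat_m, lat_s, lon_m, lon_s) > 59:    # minutes/seconds out of range
        return None
    return (lat_d * 3600 + lat_m * 60 + lat_s,
            lon_d * 3600 + lon_m * 60 + lon_s)


def check_coord(field):
    """Classify a COORDINATE or SOURCE field: 'ok', 'invalid' or 'outside'."""
    pos = parse_coord(field)
    if pos is None:
        return "invalid"
    lat, lon = pos
    if (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
            and LON_RANGE[0] <= lon <= LON_RANGE[1]):
        return "ok"
    return "outside"


if __name__ == "__main__":
    # Sample values taken from the tables in this document.
    for sample in ("405635N1070838W", "394880N1045600W",
                   "402100N105385KW", "*PRIMARY COORD*"):
        print(sample, check_coord(sample))
```

Working in whole arc-seconds keeps the boundary comparison exact, so a
point lying precisely 10 minutes outside the rectangle is still
accepted and only points "more than 10 minutes" out are flagged.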