Version 5.0.0 - August 3, 2020
Goldberg, D. W., Cockburn, M. G. (2010). Improving geocode accuracy with candidate selection criteria.
Transactions in GIS. Vol. 14 (S1), pp. 129-146.
Available online at: http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9671.2010.01211.x/abstract
Wilson, J. P., Goldberg, D. W., Swift, J. N. (2009). Central Cancer Registry Geocoding Needs.
USC GIS Research Laboratory Technical Report No. 13.
Goldberg, D. W., Swift, J. N., Wilson, J. P. (2009). Address Standardization.
USC GIS Research Laboratory Technical Report No. 12.
Goldberg, D. W., Swift, J. N., Wilson, J. P. (2009). The USC WebGIS Open Source Geocoding Platform.
USC GIS Research Laboratory Technical Report No. 11.
Available online at: http://spatial.usc.edu/wp-content/uploads/2014/03/gislabtr111.pdf
Swift, J. N., Goldberg, D. W., Wilson, J. P. (2008). Geocoding Best Practices: Review of Eight Commonly Used Geocoding Systems.
USC GIS Research Laboratory Technical Report No. 10.
Available online at: http://spatial.usc.edu/wp-content/uploads/2014/03/gislabtr10.pdf
Goldberg, D. W., Swift, J. N., Wilson, J. P. (2008). Geocoding Best Practices: Geocoding User Requirements Analysis.
USC GIS Research Laboratory Technical Report No. 9.
Available online at: http://spatial.usc.edu/wp-content/uploads/2014/03/gislabtr9.pdf
Goldberg, D. W., Swift, J. N., Wilson, J. P. (2008). Geocoding Best Practices: Reference Data, Input Data, and Feature Matching.
USC GIS Research Laboratory Technical Report No. 8.
Available online at: http://spatial.usc.edu/wp-content/uploads/2014/03/gislabtr8.pdf
Goldberg, D. W. (2008). A Geocoding Best Practices Guide.
Springfield, IL: North American Association of Central Cancer Registries.
Available online at: http://www.naaccr.org/LinkClick.aspx?fileticket=ZKekM8k_IQ0%3d&tabid=239&mid=699
Goldberg, D. W., Knoblock, C. A., Wilson, J. P. (2007). From text to geographic coordinates: The current state of geocoding.
Journal of the Urban and Regional Information Systems Association. Vol. 19 (1), pp. 33-46.
Available online at: http://www.urisa.org/clientuploads/directory/Documents/Journal/Vol19No1.pdf
The Texas A&M Geoservices Geocoder currently supports US geocoding with the following reference data sets:
- Boundary Solutions 2012 National Parcel File
- 2010 US Census Bureau TIGER/Line Files
  - Edges
  - Places
  - Cities
  - Consolidated Cities
  - Zip Code Tabulation Area (ZCTA)
  - County Sub Regions
  - Counties
  - States
- 2008 US Census Bureau TIGER/Line Files
  - Edges
  - Places
  - Cities
  - Consolidated Cities
  - Zip Code Tabulation Area (ZCTA)
  - County Sub Regions
  - Counties
  - States
- 2005 US Census Bureau TIGER/Line Files
  - Edges
- 2000 US Census Bureau Cartographic Boundary Files
  - Places
  - Cities
  - Zip Code Tabulation Area (ZCTA)
  - Counties
- Los Angeles County Assessor’s Parcel Files
- USPS Tiger/ZIP + 4 Files
If you have other reference data sets you would like to use for and/or contribute to
the geocoding process on this site, we can incorporate them. Please
contact us for more information.
Deterministic Matching
This version of the Texas A&M Geoservices Geocoder performs strictly
deterministic matching, i.e., probabilistic matching is not attempted.
This means that if an exact match is not found in
a particular reference dataset, no match is returned for that reference data layer.
Attribute Relaxation
This option directs the geocoder to try alternative versions of the input data
when an exact match cannot be found. In particular, attributes of the
input address are removed from the query, first one at a time and then in combination
with each other. If the option is selected, attribute relaxation is performed
on the following address attributes:
Rank | Attribute |
1) | Street predirectional |
2) | Street postdirectional |
3) | Street suffix |
4) | City |
5) | Zip |
The first few iterations the geocoder will try are listed in the next table:
Number | Pre | Name | Suffix | Post | Zip | City | State |
3620 | S | Vermont | Ave | N | 90089 | Los Angeles | Ca |
3620 | | Vermont | Ave | N | 90089 | Los Angeles | Ca |
3620 | S | Vermont | Ave | | 90089 | Los Angeles | Ca |
3620 | S | Vermont | | N | 90089 | Los Angeles | Ca |
3620 | | Vermont | Ave | | 90089 | Los Angeles | Ca |
3620 | | Vermont | | N | 90089 | Los Angeles | Ca |
3620 | S | Vermont | | | 90089 | Los Angeles | Ca |
3620 | | Vermont | | | 90089 | Los Angeles | Ca |
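The relaxation order in the table above can be sketched in a few lines of Python. This is only an illustration, not the geocoder's actual implementation: the function name `relaxed_variants` and the dictionary keys are hypothetical, and the sketch assumes that attributes are dropped as subsets of increasing size, which reproduces the eight rows shown when relaxation is limited to the three street attributes.

```python
from itertools import combinations


def relaxed_variants(address, relaxable):
    """Yield copies of `address` with every subset of `relaxable` blanked.

    Subsets are produced smallest first, so single attributes are dropped
    before combinations of attributes. (Hypothetical helper, for illustration.)
    """
    for size in range(len(relaxable) + 1):
        for dropped in combinations(relaxable, size):
            yield {key: ("" if key in dropped else value)
                   for key, value in address.items()}


# Same example address and column order as the table above.
address = {
    "number": "3620", "pre": "S", "name": "Vermont", "suffix": "Ave",
    "post": "N", "zip": "90089", "city": "Los Angeles", "state": "Ca",
}

# Relaxing only pre, post, and suffix gives the eight query versions listed.
variants = list(relaxed_variants(address, ["pre", "post", "suffix"]))
for variant in variants:
    print(" | ".join(variant.values()))
```

Passing all five ranked attributes would also relax City and Zip; how the service interleaves those with the street attributes is not stated here, so the sketch leaves that choice to the caller.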
Substring Matching
This option directs the geocoder to use substring matching techniques to test
for matches in the database. Using this approach increases the likelihood of
finding a match if the input data or reference data are incomplete (the recall
is increased). However, using this also increases the chances that wrong results
are returned (the precision is decreased).
The following table shows examples of this strategy:
Query | Reference Feature | Match |
Vermont | Vermont | yes |
Verm | Vermont | yes |
Mont | Vermont | yes |
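A minimal sketch of substring matching follows, assuming a case-insensitive containment test. The function name `substring_match` and the sample street names other than Vermont are hypothetical; the service's actual comparison rules are not described here.

```python
def substring_match(query, reference):
    """Return True if the query text occurs anywhere in the reference name.

    Case-insensitive; an empty query never matches. (Illustrative only.)
    """
    needle = query.strip().lower()
    return bool(needle) and needle in reference.lower()


# The three rows of the table above all match.
for query in ("Vermont", "Verm", "Mont"):
    print(query, substring_match(query, "Vermont"))

# The same looseness lowers precision: a short query can match unrelated
# features (hypothetical street names).
streets = ["Vermont", "Montana", "Fremont"]
matches = [name for name in streets if substring_match("Mont", name)]
```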
Soundex Matching
This option directs the geocoder to use soundex matching techniques to test
for matches in the database. Using this approach increases the likelihood of
finding a match if the input data or reference data have minor misspellings (the recall
is increased). However, using this also increases the chances that wrong results
are returned (the precision is decreased).
The following table shows examples of this strategy:
Query | Query Soundex | Reference Feature | Reference Feature Soundex | Match |
Vermont | V655 | Vermont | V655 | yes |
Vermond | V655 | Vermont | V655 | yes |
Varnend | V655 | Vermont | V655 | yes |
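The codes in the table can be reproduced with the standard American Soundex algorithm, sketched below. Which Soundex variant the geocoder uses is not stated, so treat this as an assumption; the function name `soundex` is illustrative.

```python
# Letter-to-digit groups of standard American Soundex.
_GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
           ("L", "4"), ("MN", "5"), ("R", "6"))
_CODES = {letter: digit for letters, digit in _GROUPS for letter in letters}


def soundex(word):
    """Return the four-character American Soundex code of `word`."""
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    previous = _CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # a vowel resets `previous`, allowing a repeat
    return (first + "".join(digits) + "000")[:4]


print(soundex("Vermont"))  # V655
print(soundex("Vermond"))  # V655
print(soundex("Varnend"))  # V655
```

Two names are treated as a Soundex match when their codes are equal, which is why all three queries above match the reference feature "Vermont".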
Linear Interpolation
The following linear interpolation techniques are used:
The following areal-unit interpolation techniques are used:
Unknown |
LinearInterpolation |
ArealInterpolation |
None |
NotAttempted |
Unknown |
LinearInterpolationAddressRange |
LinearInterpolationUniformLot |
LinearInterpolationActualLot |
LinearInterpolationMidPoint |
ArealInterpolationBoundingBoxCentroid |
ArealInterpolationConvexHullCentroid |
ArealInterpolationGeometricCentroid |
None |
NotAttempted |
- Parcel centroid: An exact match was found to a parcel and its centroid is returned as output
- Street segment: A match was found to the street segment and the address range associated with the segment was used to interpolate a point to return as output
- ZCTA: A match was found to the ZIP portion of the address and its centroid was returned as output
- City: A match was found to the city portion of the address and its centroid was returned as output
- County subregion: A match was found to the city portion of the address in the county subregion reference data set and its centroid was returned as output
- County: A match was found to the city portion of the address in the county reference data set and its centroid was returned as output
- Unmatchable: A match could not be found for the input
Unknown |
GPS |
BuildingCentroid |
Building |
BuildingDoor |
Parcel |
StreetSegment |
StreetIntersection |
StreetCentroid |
USPSZipPlus5 |
USPSZipPlus4 |
USPSZipPlus3 |
USPSZipPlus2 |
USPSZipPlus1 |
USPSZip |
ZCTAPlus5 |
ZCTAPlus4 |
ZCTAPlus3 |
ZCTAPlus2 |
ZCTAPlus1 |
ZCTA |
City |
ConsolidatedCity |
MinorCivilDivision |
CountySubRegion |
County |
State |
Country |
Unmatchable |
- Exact parcel centroid: An exact match was found to a parcel and its centroid is returned as output
- Nearest parcel centroid: A match was found to the nearest parcel and its centroid is returned as output
- Uniform lot interpolation: A match was found to the street segment and the number of lots on the segment was used to interpolate a point to return as output
- Address range interpolation: A match was found to the street segment and the address range associated with the segment was used to interpolate a point to return as output
- ZCTA centroid: A match was found to the ZIP portion of the address and its centroid was returned as output
- City centroid: A match was found to the city portion of the address and its centroid was returned as output
- County subregion centroid: A match was found to the city portion of the address in the county subregion reference data set and its centroid was returned as output
- County centroid: A match was found to the city portion of the address in the county reference data set and its centroid was returned as output
- Unmatchable: A match could not be found for the input
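The two street-segment interpolation methods in the list above can be illustrated with a simplified sketch. It assumes a straight segment given by two endpoints and ignores details a production geocoder would handle, such as odd/even street sides, setbacks from the centerline, and multi-vertex geometries. All function and parameter names are hypothetical.

```python
def _point_along(start, end, fraction):
    """Return the point at `fraction` (0..1) of the way from start to end."""
    fraction = min(1.0, max(0.0, fraction))
    return (start[0] + fraction * (end[0] - start[0]),
            start[1] + fraction * (end[1] - start[1]))


def address_range_point(start, end, low, high, number):
    """Place `number` proportionally within the segment's address range."""
    if high == low:
        return _point_along(start, end, 0.5)  # degenerate range: midpoint
    return _point_along(start, end, (number - low) / (high - low))


def uniform_lot_point(start, end, lot_count, lot_index):
    """Place the address at the center of lot `lot_index` (0-based),
    assuming the segment is divided into `lot_count` equal-width lots."""
    return _point_along(start, end, (lot_index + 0.5) / lot_count)


# Hypothetical segment 100 units long with address range 3600-3700.
p_range = address_range_point((0.0, 0.0), (100.0, 0.0), 3600, 3700, 3620)
# Same segment, assumed to hold 4 equal lots; address on the first lot.
p_lot = uniform_lot_point((0.0, 0.0), (100.0, 0.0), 4, 0)
```

Address range interpolation needs only the range stored with the segment, while uniform lot interpolation additionally needs the number of lots on the segment.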
Unmatchable |
Unknown |
AddressPoint |
GPS |
CountyParcel |
RoofTop |
ParcelCentroid |
PrimaryStructureEntrance |
DrivewayEntrance |
BuildingFrontDoor |
BuildingCentroid |
ExactParcelCentroidPoint |
ExactParcelCentroid |
NearestParcelCentroidPoint |
NearestParcelCentroid |
ActualLotInterpolation |
UniformLotInterpolation |
AddressRangeInterpolation |
StreetIntersection |
StreetCentroid |
ZCTAPlus5Centroid |
ZCTAPlus4Centroid |
ZCTAPlus3Centroid |
ZCTAPlus2Centroid |
ZCTAPlus1Centroid |
ZCTACentroid |
USPSZipPlus5LineCentroid |
USPSZipPlus4LineCentroid |
USPSZipPlus5AreaCentroid |
USPSZipPlus4AreaCentroid |
USPSZipPlus3AreaCentroid |
USPSZipPlus2AreaCentroid |
USPSZipPlus1AreaCentroid |
USPSZipAreaCentroid |
CityCentroid |
ConsolidatedCityCentroid |
CountySubdivisionCentroid |
CountyCentroid |
StateCentroid |
CountryCentroid |
DynamicFeatureCompositionCentroid |
The NAACCR GIS Coordinate Quality Codes provide an indication of the level of accuracy of a geocode.
These codes describe information about:
- The quality of the reference feature matched to - parcel vs. street intersection
- The type of input data submitted - ZIP centroid of an address vs. ZIP centroid of a PO Box
Code | Value | Description |
98 | Unknown | Latitude and longitude are assigned, but coordinate quality is unknown |
00 | AddressPoint | Coordinates derived from local government-maintained address points, which are based on property parcel locations, not interpolation over a street segment’s address range |
01 | GPS | Coordinates assigned by Global Positioning System (GPS) |
02 | Parcel | Coordinates are match of house number and street, and based on property parcel location |
03 | StreetSegmentInterpolation | Coordinates are match of house number and street, interpolated over the matching street segment’s address range |
04 | StreetIntersection | Coordinates are street intersections |
05 | StreetCentroid | Coordinates are at mid-point of street segment (missing or invalid building number) |
06 | AddressZIPPlus4Centroid | Coordinates are address ZIP code+4 centroid |
07 | AddressZIPPlus2Centroid | Coordinates are address ZIP code+2 centroid |
08 | ManualLookup | Coordinates were obtained manually by looking up a location on a paper or electronic map |
09 | AddressZIPCentroid | Coordinates are address 5-digit ZIP code centroid |
10 | POBoxZIPCentroid | Coordinates are point ZIP code of Post Office Box or Rural Route |
11 | CityCentroid | Coordinates are centroid of address city (when address ZIP code is unknown or invalid, and there are multiple ZIP codes for the city) |
12 | CountyCentroid | Coordinates are centroid of county |
99 | Unmatchable | Latitude and longitude are not assigned, but geocoding was attempted; unable to assign coordinates based on available information |
 | Missing | GIS Coordinate Quality not coded |
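When post-processing results, it can help to keep these codes as two-character strings, because tools that read them as numbers turn "00" through "09" into 0 through 9. The lookup below is a small sketch built from the table above; the names `GIS_COORDINATE_QUALITY` and `quality_name` are hypothetical and not part of any API.

```python
# Code -> value, copied from the table above. The blank code means "Missing".
GIS_COORDINATE_QUALITY = {
    "00": "AddressPoint",
    "01": "GPS",
    "02": "Parcel",
    "03": "StreetSegmentInterpolation",
    "04": "StreetIntersection",
    "05": "StreetCentroid",
    "06": "AddressZIPPlus4Centroid",
    "07": "AddressZIPPlus2Centroid",
    "08": "ManualLookup",
    "09": "AddressZIPCentroid",
    "10": "POBoxZIPCentroid",
    "11": "CityCentroid",
    "12": "CountyCentroid",
    "98": "Unknown",
    "99": "Unmatchable",
    "": "Missing",
}


def quality_name(code):
    """Return the value name for a code, or None if the code is not listed.

    Accepts strings or integers and restores a leading zero that was lost
    when a code such as "03" was read as the number 3.
    """
    key = "" if code is None else str(code).strip()
    if key:
        key = key.zfill(2)
    return GIS_COORDINATE_QUALITY.get(key)


print(quality_name("03"))  # StreetSegmentInterpolation
print(quality_name(3))     # StreetSegmentInterpolation
```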
The NAACCR Census Tract Certainty Codes provide an indication of the level of accuracy one can expect from the Census data associated with a geocode.
These codes describe information about:
- The quality of the reference feature matched to - parcel vs. street intersection
- The type of input data submitted - ZIP centroid of an address vs. ZIP centroid of a PO Box
- The relationship between the Census geographies matched to and the reference feature used to produce the geocode - Residence City Or ZIP With One Census Tract
Code | Value | Description |
 | Unknown | Unknown |
1 | ResidenceStreetAddress | Census tract based on complete and valid street address of residence |
2 | ResidenceZIPPlus4 | Census tract based on residence ZIP + 4 |
3 | ResidenceZIPPlus2 | Census tract based on residence ZIP + 2 |
4 | ResidenceZIP | Census tract based on residence ZIP code only |
5 | POBoxZIP | Census tract based on ZIP code of P.O. Box |
6 | ResidenceCityOrZIPWithOneCensusTract | Census tract/BNA based on residence city where city has only one census tract, or based on residence ZIP code where ZIP code has only one census tract |
9 | Missing | Not assigned, geocoding attempted |
99 | Unmatchable | Geocoding attempted, unable to assign |
 | NotAttempted | Not assigned, geocoding not attempted |
The Texas A&M Geoservices Geocoder allows the user to choose whether the "best" geocode
returned for an address is chosen dynamically, based on an accuracy
metric calculated by the geocoder, or statically, always in the same order.
Details about each of the available methods can be found in the following publication:
Goldberg, D. W., Cockburn, M. G. (2010). Improving geocode accuracy with candidate selection criteria. Transactions in GIS. Vol. 14 (S1), pp. 129-146.
FeatureClassBased |
UncertaintySingleFeatureArea |
UncertaintyMultiFeatureGraviational |
UncertaintyMultiFeatureTopological |
The uncertainty hierarchy directs the geocoder to choose the geocode
with the lowest uncertainty
as the resulting "best geocode" that should be returned for an address.
This option will slow down the processing of your records.
When this
option is not selected, the "best geocode" will be chosen based on the first
geocode that matches in the following table (a variant of the NAACCR Hierarchy):
Exact parcel centroid |
Nearest parcel centroid |
Uniform lot interpolation |
Address range interpolation |
ZIP code centroid |
City centroid |
County subdivision centroid |
County centroid |
State centroid |
Country centroid |
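The difference between the two selection modes can be sketched as follows. The candidate structure (a dictionary with `type` and `uncertainty` keys), the function names, and the uncertainty values are assumptions made for illustration; the accuracy metrics the geocoder actually computes are described in the publication cited above.

```python
# Static order, copied from the table above (highest priority first).
STATIC_HIERARCHY = [
    "Exact parcel centroid",
    "Nearest parcel centroid",
    "Uniform lot interpolation",
    "Address range interpolation",
    "ZIP code centroid",
    "City centroid",
    "County subdivision centroid",
    "County centroid",
    "State centroid",
    "Country centroid",
]


def best_by_hierarchy(candidates):
    """Return the first candidate whose type appears in the static order."""
    for geocode_type in STATIC_HIERARCHY:
        for candidate in candidates:
            if candidate["type"] == geocode_type:
                return candidate
    return None


def best_by_uncertainty(candidates):
    """Return the candidate with the lowest uncertainty value."""
    return min(candidates, key=lambda c: c["uncertainty"], default=None)


# Hypothetical candidates for one address; the uncertainty numbers are made up.
candidates = [
    {"type": "ZIP code centroid", "uncertainty": 2.0},
    {"type": "Address range interpolation", "uncertainty": 5.0},
]
static_choice = best_by_hierarchy(candidates)
dynamic_choice = best_by_uncertainty(candidates)
```

In this made-up case the static order picks the address range interpolation, while the uncertainty-based choice picks the ZIP code centroid, which shows how the two modes can return different "best" geocodes for the same input.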
All APIs available from the Texas A&M Geoservices website (geocoding, address parsing, etc.) use the same set of query status codes, listed in the following table.
Group | Code Value | Code Name |
Success | 200 | Success |
API Key Errors | 400 | API Key Error |
API Key Errors | 401 | API Key Missing |
API Key Errors | 402 | API Key Invalid |
API Key Errors | 403 | API Key Not Activated |
Non-Profit Errors | 450 | Non Profit Error |
Non-Profit Errors | 451 | Non Profit Not Confirmed |
Quota Errors | 470 | Quota Exceeded Error |
Quota Errors | 471 | Anonymous Quota Exceeded |
Quota Errors | 472 | Paid Quota Exceeded |
Versions Errors | 480 | Version Missing |
Versions Errors | 481 | Version Invalid |
Internal Errors | 500 | Failure |
Internal Errors | 501 | Internal Error |
Unknown Errors | 0 | Unknown |
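Client code can turn a status code into its group and name with a simple lookup built from the table above. This sketch does not show how the code is extracted from a response, since that depends on the API and output format being used; the names `STATUS_CODES` and `describe_status` are hypothetical, and mapping unlisted codes to "Unknown" is an assumption.

```python
# Code value -> (group, code name), copied from the table above.
STATUS_CODES = {
    200: ("Success", "Success"),
    400: ("API Key Errors", "API Key Error"),
    401: ("API Key Errors", "API Key Missing"),
    402: ("API Key Errors", "API Key Invalid"),
    403: ("API Key Errors", "API Key Not Activated"),
    450: ("Non-Profit Errors", "Non Profit Error"),
    451: ("Non-Profit Errors", "Non Profit Not Confirmed"),
    470: ("Quota Errors", "Quota Exceeded Error"),
    471: ("Quota Errors", "Anonymous Quota Exceeded"),
    472: ("Quota Errors", "Paid Quota Exceeded"),
    480: ("Versions Errors", "Version Missing"),
    481: ("Versions Errors", "Version Invalid"),
    500: ("Internal Errors", "Failure"),
    501: ("Internal Errors", "Internal Error"),
    0: ("Unknown Errors", "Unknown"),
}


def describe_status(code):
    """Return (group, name) for a query status code.

    Codes not in the table are reported as Unknown (an assumption).
    """
    return STATUS_CODES.get(int(code), STATUS_CODES[0])


group, name = describe_status("472")
print(group, "-", name)  # Quota Errors - Paid Quota Exceeded
```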
- StreetAddress: The matched input data was a postal street address
  Example: 3620 South Vermont Ave, Los Angeles, CA 90089-0255
- PostOfficeBox: The matched input data was a Post Office Box address
  Example: PO Box 0255, Los Angeles, CA 90089-0255
- RuralRoute: The matched input data was a Rural Route address
  Example: RR 13 Box 2, Los Angeles, CA 90089-0255
- StarRoute: The matched input data was a Star Route address
  Example: Star Route 13 Box 2, Los Angeles, CA 90089-0255
- HighwayContractRoute: The matched input data was a Highway Contract Route address
  Example: HC 13 Box 2, Los Angeles, CA 90089-0255
- Intersection: The matched input data was an intersection of two or more streets
  Example: 36th and Vermont, Los Angeles, CA
- NamedPlace: The matched input data was a named place
  Example: USC GIS Research Laboratory, Los Angeles, CA
- RelativeDirection: The matched input data was a relative direction
  Example: 1 mile south of downtown Los Angeles
- Unmatchable: A match could not be found for the input
Unmatchable |
Unknown |
StreetAddress |
PostOfficeBox |
RuralRoute |
StarRoute |
HighwayContractRoute |
Intersection |
NamedPlace |
RelativeDirection |
USPSZIP |
City |
State |
- Exact: The input data exactly matched a feature in the reference data source:
  input | 3620 S Vermont Ave, Los Angeles, CA 90089-0255 |
  reference | 3620 S Vermont Ave, Los Angeles, CA 90089-0255 |
- Relaxed: One or more of the input data attributes had to be removed to find a match in the reference data source:
  input | 3620 Vermont Ave, Los Angeles, CA 90089-0255 |
  reference | 3620 S Vermont Ave, Los Angeles, CA 90089-0255 |
  South is missing from the input and is present in the reference.
  For more details, see the section on Attribute Relaxation above.
- Soundex: One or more of the input data attributes had to be matched with soundex to find a match in the reference data source:
  input | 3620 S Vermont Ave, Los Angeles, CA 90089-0255 |
  reference | 3620 S Vermont Ave, Los Angeles, CA 90089-0255 |
  Soundex("Vermont") = V655 = Soundex("Vermont")
  For more details, see the section on Soundex Matching above.
NoMatch |
Exact |
Relaxed |
Substring |
Soundex |
Composite |
Nearby |
Unknown |
- Success: An exact match was found
- Unmatchable: A match could not be found for the input
Unknown |
Success |
Ambiguous |
BrokenTie |
Composite |
Nearby |
LessThanMinimumScore |
InvalidFeature |
NullFeature |
Unmatchable |
ExceptionOccurred |
Unknown |
RevertToHierarchy |
FlipACoin |
DynamicFeatureComposition |
RegionalCharacteristics |
ReturnAll |
ChooseFirstOne |
Including this option in either batch processing or the API returns the geometry
of the reference feature used for interpolation along with the geocode
result. This option applies only when using the API in verbose mode, or when
the optional reference feature fields are selected in a batch process.
It is not recommended that this option be selected unless you have a specific need for the underlying reference feature geometry.
You can always return to the site and easily get the reference feature geometry for any specific geocode of interest.
Geometry format | The geometry of the reference feature will be reported in OGC GML |
Geometry text size | OGC GML is extremely verbose. This means that a lot of text will be returned if the geometry is a complex object such as a ZIP code, city, or state |
CSV files in batch mode (and TSV) | CSV files may become unusable due to the amount of text returned for a GML geometry |
Access database files in batch mode (mdb and accdb) | Access database files may grow beyond the maximum file size of 2GB and become unusable due to the amount of text returned for a GML geometry |
The following list contains the set of known bugs for this release of the geocoding service.
We are constantly working to improve the service and will be addressing these bugs in future releases.
If you discover or suspect another bug, please
report it.
ID | Type | Description |
1 | Input data | Street intersection data are not supported |
2 | Feature matching | A state is required to obtain any type of street- or parcel-level match |
3 | Feature matching | Probabilistic feature matching is not supported |
4 | Feature matching/Feature interpolation | Street centroids are not returned for street-level matches without address numbers |
5 | Address parsing/Feature matching | ZIP codes with leading zeros have the leading zeros removed |
6 | Address parsing/Feature matching | City portions of an address are not normalized, so you should pre-normalize them if possible, e.g., use "East Hanover" instead of "E Hanover" |