Bulk Address Validation: Command-Line Interface
If you have address lists larger than 25,000 records, and you have some experience with the command line, this interface might become your new best friend. It processes lists of any size, standardizing and validating addresses and providing lots of relevant data. And it does so with astonishing speed. (If you're not yet familiar with the command line, you can standardize and validate address lists larger than 25,000 records by splitting them up and submitting them to our Web Interface.)
- On Scripting and Automation
- Preparing Your Input File
- Using the Interface
- The Output File
- The Log File
- Command-Line Parameters
An Important Note On Scripting and Automation
This Command-Line Interface tool is provided as a convenience for (mostly) non-programmers who need to process large quantities of addresses formatted as CSV records. It is intended to be invoked manually by a person typing at a command prompt (not the most friendly of user experiences, we get it). Deploying this tool in an automated environment to process ad-hoc address data is not supported. This constraint is based on how we provide software updates for this tool. If you need to process address data from deployed software running autonomously, we recommend our officially supported SDKs. For those who seek an even more direct HTTP integration, we also provide detailed US Street Address API documentation.
You can download (free) the Command-Line Interface for the following platforms:
After downloading one of the above packages, extract the contents of the archive to your desktop. You'll see a SmartyList folder containing the following files:
- smartylist: This is the application. Instead of double-clicking it, you will access it from the command line.
- sample-input.csv: This is a simple address list for your reference.
- sample-output.csv: This is the output produced by processing the sample-input.csv file above.
- change-log.txt: A log of recent changes made by the software developers.
- DO-NOT-README.txt: Actually, please read it.
Power users: Feel free to copy or move smartylist to wherever is convenient. On a Linux machine you might put it in /usr/local/bin or somewhere else that is already in your PATH.
Preparing Your Input File
Save your input file as a CSV (comma-separated or tab-delimited) text file, within the SmartyList folder on your desktop. Arrange your data fields according to one of the patterns shown below. (You can also see an example in the sample input file included with the download.) The first row MUST consist of field names, spelled exactly as you see here.
address (entire address in a single field)
Note: An additional column for secondary (apt/ste/etc.) can be added to any of the first three examples above. (e.g.,
The acceptable input fields for this tool are the same as the acceptable input fields for the US Street Address API.
If you'd like to see an example of a well-formatted input file, just take a peek at this sample spreadsheet.
In addition, you may include a column named match with a valid match value outlined in the acceptable input fields. If the match column is omitted or the column value in the row is empty, the default match value of invalid is assumed.
Optionally, you can include fields that contain non-address data (like ID number or business name). All your input data will be returned untouched as part of the output. (If you do include non-address data, be sure to give those data fields non-address names.)
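Because the first row must contain field names spelled exactly as documented, it can help to print them one per line before running the tool. The following is a minimal sketch using standard Unix utilities, not a feature of SmartyList. The show_header function and the demo file contents are made up for illustration, and the naive split assumes the header row contains no quoted commas or tabs.

```shell
# Print each field name from the first row of a CSV file on its own line.
# Handles comma- or tab-delimited headers and strips Windows carriage returns.
# Assumption: the header row has no quoted fields containing delimiters.
show_header() {
  head -n 1 "$1" | tr -d '\r' | tr ',\t' '\n\n'
}

# Hypothetical demo file: "id" is non-address data; "address" and "match"
# are input field names mentioned in this guide.
demo=$(mktemp)
printf 'id,address,match\n1,1 Example St Anytown ST 00000,invalid\n' > "$demo"
show_header "$demo"
```

Comparing this list against the documented field names is a quick way to spot a misspelled or unexpected column before you submit a large list.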
One final consideration: Make sure your list doesn't include blank lines (except at the end). By "blank lines" we mean lines that have no delimiters (commas or tabs) and no data except a carriage return character (and/or line feed character). Blank lines can cause line numbers to output incorrectly, which makes pasting back into a spreadsheet a bit tricky. If you insist on having blank lines, make sure each record has an 'ID' field containing a unique value.
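To check a list for blank lines before processing, something like the following sketch may help. It uses awk, which is not part of SmartyList, and the find_blank_lines name and demo data are hypothetical. It reports only blank lines that are followed by more data, since blank lines at the end of the file are acceptable.

```shell
# Print the line number of every blank line that is followed by more data.
# "Blank" means the line is empty apart from an optional carriage return.
find_blank_lines() {
  awk '
    { sub(/\r$/, "") }                     # ignore a trailing carriage return
    $0 == "" { pending[++n] = NR; next }   # remember blanks until data appears
    { for (i = 1; i <= n; i++) print pending[i]; n = 0 }
  ' "$1"
}

# Hypothetical demo: line 3 is blank mid-file, lines 5 and 6 are trailing blanks.
demo=$(mktemp)
printf 'address\n1 Example St\n\n2 Example Ave\n\n\n' > "$demo"
find_blank_lines "$demo"
```

If the function prints nothing, the list has no blank lines in the middle of the data.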
Using the Interface
Open your favorite command-line application, and use the "change directory" command to navigate to the directory where your Command-Line Interface files reside. On Windows, we recommend running this command as an administrator. This is what that might look like:
Three specific command-line parameters are required in order to process a list: -auth-id, -auth-token, and -input. (To find your -auth-id and -auth-token, open the API Keys tab of your account and look under the heading of Secret Keys.) The -input parameter tells the tool where your input file is. If you placed your input file inside of the SmartyList folder, the complete command to process it might look like this:
Windows:
smartylist -auth-id="123" -auth-token="Abc" -input="your_file"
Mac/Linux:
./smartylist -auth-id="123" -auth-token="Abc" -input="your_file"
We suggest you try a short list first, to make sure everything is working as expected. When you run the command, the terminal will first display your current configuration settings, so you can verify that they are as desired. It will also list your input field names, and below those, the matching data type for each. Make sure these are correct.
Finally, the prompt will ask if everything appears to be in order. If everything looks right, type "y" then hit "enter." During processing, the terminal will display a progress bar. (Although, if your list is small, the job will be done almost instantly.)
The Output File
By default, the output file will be placed next to the input file, and it will be named like the input file, except with "-output" appended. (If you wish, you can specify a different output directory using the -output command-line parameter.)
After opening the output file, you will see all of your original data fields on the left, followed by an empty field, followed by our output data fields on the right, with field names in brackets. If you need help understanding the many output fields, please see Address Output Fields.
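Since the output field names appear in brackets, one way to see which output columns a file contains is to pull the bracketed names from its header row. This is a rough sketch using standard Unix utilities. The list_output_fields function is hypothetical, the bracketed names in the demo header are placeholders rather than confirmed SmartyList column names, and the sketch assumes a comma- or tab-delimited header with no quoted delimiters.

```shell
# Print the bracketed (output) field names from the header row, one per line.
# Assumption: fields are separated by commas or tabs and none contain
# quoted delimiters.
list_output_fields() {
  head -n 1 "$1" | tr -d '\r' | tr ',\t' '\n\n' | grep '^\[.*]$'
}

# Hypothetical demo header: two original fields, an empty separator field,
# then two placeholder output fields in brackets.
demo=$(mktemp)
printf 'id,address,,[delivery_line_1],[last_line]\n' > "$demo"
list_output_fields "$demo"
```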
The Log File
Every time you process a list with the Command-Line Interface, it will produce a log file and place it next to the corresponding input file. The name of the log file will follow this pattern:
The file will contain all the information displayed by the terminal before processing, as well as a precise play-by-play of the tool's various actions. In the unlikely event that your list fails to process, check the log file for the gory details of what happened. If you contact Support with questions, they may ask to see this file in order to aid in the debugging process.
Command-Line Parameters
Here we list all the command-line parameters that can be used with our Command-Line Interface. As explained above, the first three parameters listed below are all that are required to process a list. The others are optional; you can employ them to customize the tool's functionality. To use them, simply list them when you run smartylist at the command prompt, following this model:
smartylist -[parameter] -[another-parameter]
-auth-id: The auth-id value (or name of environment variable) to use for API requests.
-auth-token: The auth-token value (or name of environment variable) to use for API requests.
-input: The path to the input file that contains the addresses you want to validate.
-output: If provided, this is where the bulk validation tool will place the output file containing the results of processing your input file. If not provided, the tool will place the output alongside the input.
If desired, you can tell the bulk validation tool where to put the diagnostic log file. If this parameter is not provided, the tool will place the log file alongside the input file.
The base URL to use for API requests if you are pointing to an onsite API installation. If you are using our regular cloud service, this parameter is not necessary.
When you provide this parameter, the tool will override any values in the match column of the input file. Valid values are the same as the match parameter for the US Street Address API.
The URL of your proxy, if one has been configured for your network. In most cases this flag is not necessary.
Tells the tool to squelch all diagnostic output and process the list without a confirmation prompt if possible. (No value needed.)
If your network connection is slow, you may receive timeout errors during execution such as context deadline exceeded (Client.Timeout or context cancellation while reading body). This parameter can help prevent those errors by allowing more time for the response to be received from the server. The default value is 5 (5 seconds).
When you provide this parameter, the tool simply prints the version of the application to stdout and exits. (No value needed.)
Updates (Pay attention, this is important!)
Try this command at the command prompt:
(Mac/Linux users may need to insert ./ in front of the word smartylist.)
The version number you see is the semantic version number of your copy of the application. Each of the three dot-delimited numbers is significant:
The first of the three numbers in the version output is the "major" version number. If we need to release a new major version, any copies of the old version will be automatically disabled, requiring you to download and install the latest version before processing any additional lists. (Read that last sentence again...slowly...just to make sure it sinks in.) This is not something we will do often and certainly not ever without extensive consideration.
The second of the three numbers in the version output is the "minor" version number. Incrementing this number means we have released new functionality that is still backwards-compatible. It would behoove you to download and install the latest version. Until you do, a message will be sent to stderr and a non-zero exit status will be returned by the application as a signal that something is amiss. The application will continue to process your lists.
The third of the three numbers in the version output is the "patch" version number. It covers patches and bug fixes (corrections to existing behavior). If this number doesn't match the latest release, it would be a good idea (probably worth a promotion!) for you to download and install the latest version so you have the most current and correct software. In this situation, a message will be sent to stdout as a signal that something is amiss. The application will continue to process your lists.
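To make the three cases concrete, here is a small sketch that compares two version strings and names the first component that differs. It only illustrates the major/minor/patch distinction described above and is not how the application itself checks for updates. The update_kind function and the version numbers are hypothetical.

```shell
# Compare an installed version with the latest release (both "X.Y.Z") and
# print the first component that differs: major, minor, patch, or current.
update_kind() {
  have=$1
  latest=$2
  have_rest=${have#*.}        # "Y.Z" of the installed version
  latest_rest=${latest#*.}    # "Y.Z" of the latest release
  if [ "${have%%.*}" != "${latest%%.*}" ]; then
    echo "major"              # old copies stop working until you update
  elif [ "${have_rest%%.*}" != "${latest_rest%%.*}" ]; then
    echo "minor"              # still works, but an update is recommended
  elif [ "${have_rest#*.}" != "${latest_rest#*.}" ]; then
    echo "patch"              # still works; update for the latest fixes
  else
    echo "current"
  fi
}

# Hypothetical version numbers, for illustration only.
update_kind 2.1.0 2.3.0
```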
New releases are announced in our open-source Changelog repository.