ArcGIS to Pandas Data Frame v2.0

Joel McCune

Jun 12, 2017

One of the most popular posts on my blog discusses loading data from an ArcGIS table, either spatial (a Feature Class) or nonspatial, into a Pandas DataFrame to take advantage of the powerful data analysis tools available in Python, such as scikit-learn or Keras paired with TensorFlow. That previous post, ArcGIS to Pandas DataFrame, details how to use the ArcPy FeatureClassToNumPyArray function to get the data into a NumPy array, which is then loaded into a Pandas DataFrame. Unfortunately, as I recently discovered, this approach does not scale very well.

The SearchCursor in the ArcPy Data Access module (arcpy.da) offers a much more efficient route: a list comprehension over the cursor builds a list of rows, restricted to the fields of interest, that can be loaded directly into a Pandas DataFrame. This will not scale indefinitely, since every row is still loaded into a single data structure, a list, before being passed to the data frame. It does scale much better, though, because the field list is limited to only the parameters of interest.

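import arcpy
import pandas as pd

# Path to the source table or feature class
table = r'C:\path\to\table'
# Only the fields needed for the analysis
field_list = ['field01', 'field02']

# Read the rows through a search cursor and name the columns after the fields
with arcpy.da.SearchCursor(table, field_list) as cursor:
    df = pd.DataFrame([row for row in cursor], columns=field_list)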
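
Because the text above describes a filtered list of rows, it may help to see how the snippet could be wrapped into a reusable function that also filters rows, not just fields. The following is a minimal sketch, not a definitive implementation: the helper names rows_to_dataframe and table_to_dataframe are hypothetical, and it assumes the optional where_clause parameter of arcpy.da.SearchCursor accepts None when no filter is wanted.

```python
import pandas as pd


def rows_to_dataframe(rows, field_list):
    """Build a DataFrame from any iterable of row tuples, one column per field."""
    return pd.DataFrame(list(rows), columns=field_list)


def table_to_dataframe(table, field_list, where_clause=None):
    """Hypothetical helper: read selected fields, optionally filtered, into a DataFrame."""
    # Imported here so the module can be loaded on a machine without ArcGIS
    import arcpy

    # where_clause is an SQL expression limiting which rows the cursor returns
    # (assumption: passing None applies no filter)
    with arcpy.da.SearchCursor(table, field_list, where_clause=where_clause) as cursor:
        return rows_to_dataframe(cursor, field_list)
```

A call might look like table_to_dataframe(r'C:\path\to\table', ['field01', 'field02'], where_clause="field01 > 100"), where the expression is only a placeholder. Separating rows_to_dataframe from the cursor keeps the DataFrame construction usable with any iterable of tuples.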

References:

Pandas DataFrame: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html
ArcGIS Data Access Search Cursor: http://pro.arcgis.com/en/pro-app/arcpy/data-access/searchcursor-class.htm