What is ArcPy? Part 1

ArcPy-generated point feature data from groundwater contamination, PFAS chemical levels, and MNPCA cleanup activity sites

  • Multiple spreadsheet data sources converted to points in ArcPy and mapped with ArcGIS Pro

What is ArcPy?

ArcPy is a Python library for automating GIS tasks in ArcGIS Pro

  • Collection of modules, functions, and classes from the ArcGIS toolbox
  • It allows you to perform geoprocessing, analysis, data management, and mapping automation using Python
In [ ]:
import arcpy # allows for access to ArcGIS Pro geoprocessing tools and workflow automation

import os # enables interaction with local system resources (paths to folders and files)

Set up a workspace

arcpy.env.workspace

  • Use-case: get or set the current workspace environment in which to perform arcpy operations
  • Breaking down the syntax

    • arcpy --> library
    • env --> class that manages environment settings
    • workspace --> attribute of the env class within the arcpy library
  • Note the general pattern: library.class.attribute

In [72]:
##===================== Define the path to script and project folder

# Gets folder where the script is located , use try/ except to handle different environments

try:
    # works in normal Python .py scripts, where __file__ is defined
    script_path = os.path.dirname(os.path.abspath(__file__))
except NameError:
    # works in a notebook environment, where __file__ is not defined
    script_path = os.getcwd()

print("Script is here", script_path)

# Get the folder inside the script folder
dir_path = os.path.join(script_path, "01 Data")

print("Directory path set to : ", dir_path)


# make sure the path exists on your operating system, if not create it
if not os.path.exists(dir_path):
    os.makedirs(dir_path) 

# set the folder path as the working environment, and store it for later use
try:
    arcpy.env.workspace = dir_path
    wksp = arcpy.env.workspace
except Exception as e:
    print(f"Error setting up the path, {e}, Exception type: {type(e).__name__}") # python's built-in error messaging, {e} prints description and type(e).__name__ category

print("Working environment is here",wksp)
Script is here c:\Projects\my_git_pages_website\Py-and-Sky-Labs\content\ArcPY
Directory path set to :  c:\Projects\my_git_pages_website\Py-and-Sky-Labs\content\ArcPY\01 Data
Working environment is here c:\Projects\my_git_pages_website\Py-and-Sky-Labs\content\ArcPY\01 Data
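The same folder setup can also be written with pathlib. The following is a minimal sketch rather than the notebook's own code: the helper name `resolve_data_dir` and its `base` parameter are hypothetical, introduced so the logic can be reused and tried on a throwaway folder.

```python
from pathlib import Path
import tempfile


def resolve_data_dir(base=None, folder_name="01 Data"):
    """Return <base>/<folder_name>, creating the folder if it is missing.

    `base` defaults to the current working directory, which is what the
    notebook branch (os.getcwd()) of the cell above falls back to.
    """
    base_path = Path(base) if base is not None else Path.cwd()
    data_dir = base_path / folder_name
    data_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return data_dir


# Try it on a temporary folder so nothing is written into a real project
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = resolve_data_dir(tmp)
    print("Directory path set to:", demo_dir)
    demo_dir_exists = demo_dir.is_dir()  # checked while the temp folder is alive
    demo_dir_name = demo_dir.name
```

To use the result as a workspace, convert it to a string first, for example `arcpy.env.workspace = str(resolve_data_dir())`.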
In [ ]:
## We often end up re-running the workflow to tweak the output
## Modify the env attribute to allow overwriting output files
arcpy.env.overwriteOutput = True

Arcpy Documentation

  • Before getting into the weeds, always good to have ArcGIS documentation on hand
  • Link to ArcGIS Docs
  • Here you can search a tool by name and then copy the required syntax from the python tab
  • paste the code into your script for reference
In [ ]:
### Not bad...But why not load and search the documentation directly in this workbook? 

Load ArcGIS Documentation directly into the Notebook

In [11]:
from IPython.display import IFrame # IPython.display controls notebook outputs; the IFrame class embeds a web page or local document in the output cell

# ArcGIS Pro documentation URL for a specific tool
tool_url = "https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/xy-table-to-point.htm"

# Display the documentation inside Jupyter Notebook
IFrame(tool_url, width="100%", height="600px") # iframe can be used to display local or online webpages, documents, reports, visualizations , videos
Out[11]:

Listing and searching toolbox and tool names

In [21]:
#get a list of all toolboxes
toolboxes=arcpy.ListToolboxes("*")

print("Available Toolboxes", toolboxes)
Available Toolboxes ['Space Time Pattern Mining Tools(stpm)', 'Aviation Tools(aviation)', '3D Analyst Tools(3d)', 'Territory Design Tools(td)', 'Defense Tools(defense)', 'Business Analyst Tools(ba)', 'Analysis Tools(analysis)', 'Bathymetry Tools(bathymetry)', 'Public Transit Tools(transit)', 'Cartography Tools(cartography)', 'Conversion Tools(conversion)', 'Crime Analysis and Safety Tools(ca)', 'Data Interoperability Tools(interop)', 'Linear Referencing Tools(lr)', 'Data Management Tools(management)', 'Data Reviewer Tools(Reviewer)', 'Editing Tools(edit)', 'Server Tools(server)', 'GeoAI Tools(geoai)', 'GeoAnalytics Desktop Tools(gapro)', 'GeoAnalytics Tools(geoanalytics)', 'Geocoding Tools(geocoding)', 'Network Analyst Tools(na)', 'Geostatistical Analyst Tools(ga)', 'Image Analyst Tools(ia)', 'Indoor Positioning Tools(indoorpositioning)', 'Indoors Tools(indoors)', 'Intelligence Tools(intelligence)', 'Knowledge Graph Tools(kg)', 'Location Referencing Tools(locref)', 'Maritime Tools(maritime)', 'Spatial Statistics Tools(stats)', 'Model Tools(mb)', 'Multidimension Tools(md)', 'Network Diagram Tools(nd)', 'Utility Network Tools(un)', 'Oriented Imagery Tools(oi)', 'Parcel Tools(parcel)', 'Raster Analysis Tools(ra)', 'ReadyToUseServiceTools(agolservices)', 'Reality Mapping Tools(rm)', 'Spatial Analyst Tools(sa)', 'Standard Feature Analysis Tools(sfa)', 'Topographic Production Tools(topographic)', 'Trace Network Tools(tn)', 'Workflow Manager Tools(wmx)']
In [ ]:
del toolboxes # delete the variable
In [30]:
## access all tool names

# get list of all tools
tools =arcpy.ListTools("*") # "*" alone matches every tool; combine it with part of a name to filter, e.g., "Ma*" for tools whose names start with "Ma"

print("Available Tools", tools)
Available Tools ['MakeTerritorySolutionLayer_td', 'MakeSuitabilityAnalysisLayer_ba', 'MapServerCacheTilingSchemeToPolygons_cartography', 'MapToKML_conversion', 'MakeRouteEventLayer_lr', 'MakeAggregationQueryLayer_management', 'MakeBuildingLayer_management', 'MakeFeatureLayer_management', 'MakeImageServerLayer_management', 'MakeLasDatasetLayer_management', 'MakeMosaicLayer_management', 'MakeQueryLayer_management', 'MakeQueryTable_management', 'MakeRasterLayer_management', 'MakeSceneLayer_management', 'MakeTableView_management', 'MakeTinLayer_management', 'MakeTrajectoryLayer_management', 'MakeWCSLayer_management', 'MakeXYEventLayer_management', 'ManageFeatureBinCache_management', 'ManageTileCache_management', 'MatchControlPoints_management', 'MatchLayerSymbologyToAStyle_management', 'MatchPhotosToRowsByTime_management', 'ManageMapServerCacheScales_server', 'ManageMapServerCacheStatus_server', 'ManageMapServerCacheTiles_server', 'MakeClosestFacilityAnalysisLayer_na', 'MakeLastMileDeliveryAnalysisLayer_na', 'MakeLocationAllocationAnalysisLayer_na', 'MakeNetworkDatasetLayer_na', 'MakeODCostMatrixAnalysisLayer_na', 'MakeRouteAnalysisLayer_na', 'MakeServiceAreaAnalysisLayer_na', 'MakeVehicleRoutingProblemAnalysisLayer_na', 'MakeMultidimensionalRasterLayer_md', 'MakeMultidimensionalVoxelLayer_md', 'MakeNetCDFFeatureLayer_md', 'MakeNetCDFRasterLayer_md', 'MakeNetCDFTableView_md', 'MakeOPeNDAPRasterLayer_md', 'ManageMultidimensionalRaster_md', 'MakeDiagramLayer_nd', 'MatchControlPoints_rm', 'MajorityFilter_sa', 'MakeGridsAndGraticulesLayer_topographic', 'MakeMasksFromRules_topographic']
In [34]:
# get list of all tools within a specific toolbox
tools =arcpy.ListTools("*_management") # the toolbox alias follows the underscore, so "*_management" matches every tool in the Data Management toolbox

print("Available Tools", tools)
Available Tools ['Add3DFormats_management', 'AddAttachments_management', 'AddAttributeRule_management', 'AddCodedValueToDomain_management', 'AddColormap_management', 'AddContingentValue_management', 'AddDataToTrajectoryDataset_management', 'AddFeatureClassToTopology_management', 'AddField_management', 'AddFieldConflictFilter_management', 'AddFields_management', 'AddFilesToLasDataset_management', 'AddGlobalIDs_management', 'AddGPSMetadataFields_management', 'AddIncrementingIDField_management', 'AddIndex_management', 'AddItemsToCatalogDataset_management', 'AddJoin_management', 'AddPortalItemsToCatalogDataset_management', 'AddRastersToMosaicDataset_management', 'AddRelate_management', 'AddRuleToRelationshipClass_management', 'AddRuleToTopology_management', 'AddSpatialIndex_management', 'AddSpatialJoin_management', 'AddSubtype_management', 'AddXY_management', 'Adjust3DZ_management', 'AlterAttributeRule_management', 'AlterDomain_management', 'AlterField_management', 'AlterFieldGroup_management', 'AlterMosaicDatasetSchema_management', 'AlterVersion_management', 'Analyze_management', 'AnalyzeControlPoints_management', 'AnalyzeDatasets_management', 'AnalyzeMosaicDataset_management', 'AnalyzeToolboxForVersion_management', 'AnalyzeToolsForPro_management', 'Append_management', 'AppendAnnotation_management', 'AppendControlPoints_management', 'ApplyBlockAdjustment_management', 'ApplySymbologyFromLayer_management', 'AssignDefaultToField_management', 'AssignDomainToField_management', 'BatchBuildPyramids_management', 'BatchCalculateStatistics_management', 'BatchProject_management', 'BatchUpdateFields_management', 'BearingDistanceToLine_management', 'BuildBoundary_management', 'BuildFootprints_management', 'BuildLasDatasetPyramid_management', 'BuildMosaicDatasetItemCache_management', 'BuildOverviews_management', 'BuildPyramids_management', 'BuildPyramidsandStatistics_management', 'BuildRasterAttributeTable_management', 'BuildSeamlines_management', 'BuildStereoModel_management', 
'CalculateCellSizeRanges_management', 'CalculateEndTime_management', 'CalculateField_management', 'CalculateFields_management', 'CalculateGeometryAttributes_management', 'CalculateStatistics_management', 'ChangePrivileges_management', 'ChangeVersion_management', 'CheckGeometry_management', 'ClearPixelCache_management', 'ClearWorkspaceCache_management', 'Clip_management', 'ColorBalanceMosaicDataset_management', 'Compact_management', 'CompareReplicaSchema_management', 'CompositeBands_management', 'Compress_management', 'CompressFileGeodatabaseData_management', 'ComputeBlockAdjustment_management', 'ComputeCameraModel_management', 'ComputeControlPoints_management', 'ComputeDepthMap_management', 'ComputeDirtyArea_management', 'ComputeFiducials_management', 'ComputeMosaicCandidates_management', 'ComputePansharpenWeights_management', 'ComputeTiePoints_management', 'ConfigureGeodatabaseLogFileTables_management', 'ConsolidateLayer_management', 'ConsolidateLocator_management', 'ConsolidateMap_management', 'ConsolidateProject_management', 'ConsolidateToolbox_management', 'ConvertCoordinateNotation_management', 'ConvertRasterFunctionTemplate_management', 'ConvertSchemaReport_management', 'ConvertTimeField_management', 'ConvertTimeZone_management', 'Copy_management', 'CopyFeatures_management', 'CopyRaster_management', 'CopyRows_management', 'Create3DObjectSceneLayerPackage_management', 'CreateBuildingSceneLayerPackage_management', 'CreateCatalogDataset_management', 'CreateCloudStorageConnectionFile_management', 'CreateColorComposite_management', 'CreateCustomGeoTransformation_management', 'CreateCustomVerticalTransformation_management', 'CreateDatabaseConnection_management', 'CreateDatabaseConnectionString_management', 'CreateDatabaseSequence_management', 'CreateDatabaseUser_management', 'CreateDatabaseView_management', 'CreateDataLoadingWorkspace_management', 'CreateDomain_management', 'CreateEnterpriseGeodatabase_management', 'CreateFeatureclass_management', 
'CreateFeatureDataset_management', 'CreateFieldGroup_management', 'CreateFileGDB_management', 'CreateFishnet_management', 'CreateFolder_management', 'CreateIntegratedMeshSceneLayerPackage_management', 'CreateLasDataset_management', 'CreateMapTilePackage_management', 'CreateMobileGDB_management', 'CreateMobileMapPackage_management', 'CreateMobileScenePackage_management', 'CreateMosaicDataset_management', 'CreateOrthoCorrectedRasterDataset_management', 'CreatePansharpenedRasterDataset_management', 'CreatePointCloudSceneLayerPackage_management', 'CreatePointSceneLayerPackage_management', 'CreateRandomPoints_management', 'CreateRandomRaster_management', 'CreateRasterDataset_management', 'CreateReferencedMosaicDataset_management', 'CreateRelationshipClass_management', 'CreateReplica_management', 'CreateReplicaFromServer_management', 'CreateRole_management', 'CreateSpatiallyBalancedPoints_management', 'CreateSpatialReference_management', 'CreateSpatialSamplingLocations_management', 'CreateSpatialType_management', 'CreateSQLiteDatabase_management', 'CreateTable_management', 'CreateTopology_management', 'CreateTrajectoryDataset_management', 'CreateUnRegisteredFeatureclass_management', 'CreateUnRegisteredTable_management', 'CreateVectorTileIndex_management', 'CreateVectorTilePackage_management', 'CreateVersion_management', 'CreateVoxelSceneLayerContent_management', 'DefineMosaicDatasetNoData_management', 'DefineOverviews_management', 'DefineProjection_management', 'Delete_management', 'DeleteAttributeRule_management', 'DeleteCodedValueFromDomain_management', 'DeleteColormap_management', 'DeleteDatabaseSequence_management', 'DeleteDomain_management', 'DeleteFeatures_management', 'DeleteField_management', 'DeleteFieldGroup_management', 'DeleteIdentical_management', 'DeleteMosaicDataset_management', 'DeleteRasterAttributeTable_management', 'DeleteRows_management', 'DeleteSchemaGeodatabase_management', 'DeleteVersion_management', 'DetectFeatureChanges_management', 
'DiagnoseVersionMetadata_management', 'DiagnoseVersionTables_management', 'Dice_management', 'DisableArchiving_management', 'DisableAttachments_management', 'DisableAttributeRules_management', 'DisableCOGO_management', 'DisableEditorTracking_management', 'DisableFeatureBinning_management', 'DisableReplicaTracking_management', 'Dissolve_management', 'DomainToTable_management', 'DowngradeAttachments_management', 'DownloadRasters_management', 'EditRasterFunction_management', 'Eliminate_management', 'EliminatePolygonPart_management', 'EnableArchiving_management', 'EnableAttachments_management', 'EnableAttributeRules_management', 'EnableCOGO_management', 'EnableEditorTracking_management', 'EnableEnterpriseGeodatabase_management', 'EnableFeatureBinning_management', 'EnableReplicaTracking_management', 'EncodeField_management', 'EvaluateRules_management', 'Export3DObjects_management', 'ExportAcknowledgementMessage_management', 'ExportAttachments_management', 'ExportAttributeRules_management', 'ExportContingentValues_management', 'ExportDataChangeMessage_management', 'ExportFrameAndCameraParameters_management', 'ExportGeodatabaseConfigurationKeywords_management', 'ExportMosaicDatasetGeometry_management', 'ExportMosaicDatasetItems_management', 'ExportMosaicDatasetPaths_management', 'ExportRasterWorldFile_management', 'ExportReplicaSchema_management', 'ExportReportToPDF_management', 'ExportTileCache_management', 'ExportTopologyErrors_management', 'ExportXMLWorkspaceDocument_management', 'ExtractDataFromGeodatabase_management', 'ExtractPackage_management', 'ExtractSubDataset_management', 'FeatureCompare_management', 'FeatureEnvelopeToPolygon_management', 'FeatureToLine_management', 'FeatureToPoint_management', 'FeatureToPolygon_management', 'FeatureVerticesToPoints_management', 'FieldStatisticsToTable_management', 'FileCompare_management', 'FindIdentical_management', 'Flip_management', 'GenerateAttachmentMatchTable_management', 'GenerateBlockAdjustmentReport_management', 
'GenerateExcludeArea_management', 'GenerateFgdbLicense_management', 'GenerateLicensedFgdb_management', 'GenerateMappingTable_management', 'GeneratePointCloud_management', 'GeneratePointsAlongLines_management', 'GenerateRasterCollection_management', 'GenerateRasterFromRasterFunction_management', 'GenerateRectanglesAlongLines_management', 'GenerateSchemaReport_management', 'GenerateTableFromRasterFunction_management', 'GenerateTessellation_management', 'GenerateTileCacheTilingScheme_management', 'GenerateTransectsAlongLines_management', 'GeodeticDensify_management', 'GeoTaggedPhotosToPoints_management', 'GetCellValue_management', 'GetCount_management', 'GetRasterProperties_management', 'Import3DObjects_management', 'ImportAttributeRules_management', 'ImportContingentValues_management', 'ImportGeodatabaseConfigurationKeywords_management', 'ImportMessage_management', 'ImportMosaicDatasetGeometry_management', 'ImportReplicaSchema_management', 'ImportTileCache_management', 'ImportXMLWorkspaceDocument_management', 'Integrate_management', 'InterpolateFromPointCloud_management', 'JoinField_management', 'LasDatasetStatistics_management', 'LasPointStatsAsRaster_management', 'LoadDataToPreview_management', 'LoadDataUsingWorkspace_management', 'MakeAggregationQueryLayer_management', 'MakeBuildingLayer_management', 'MakeFeatureLayer_management', 'MakeImageServerLayer_management', 'MakeLasDatasetLayer_management', 'MakeMosaicLayer_management', 'MakeQueryLayer_management', 'MakeQueryTable_management', 'MakeRasterLayer_management', 'MakeSceneLayer_management', 'MakeTableView_management', 'MakeTinLayer_management', 'MakeTrajectoryLayer_management', 'MakeWCSLayer_management', 'MakeXYEventLayer_management', 'ManageFeatureBinCache_management', 'ManageTileCache_management', 'MatchControlPoints_management', 'MatchLayerSymbologyToAStyle_management', 'MatchPhotosToRowsByTime_management', 'Merge_management', 'MergeMosaicDatasetItems_management', 'MigrateDateFieldToHighPrecision_management', 
'MigrateObjectIDTo64Bit_management', 'MigrateRelationshipClass_management', 'MigrateStorage_management', 'MinimumBoundingGeometry_management', 'Mirror_management', 'Mosaic_management', 'MosaicDatasetToMobileMosaicDataset_management', 'MosaicToNewRaster_management', 'MultipartToSinglepart_management', 'Package3DTiles_management', 'PackageLayer_management', 'PackageLocator_management', 'PackageMap_management', 'PackageProject_management', 'PackageResult_management', 'PivotTable_management', 'PointsToLine_management', 'PolygonToLine_management', 'Project_management', 'ProjectRaster_management', 'RasterCompare_management', 'RasterToDTED_management', 'RebuildIndexes_management', 'RecalculateFeatureClassExtent_management', 'ReclassifyField_management', 'ReconcileVersions_management', 'RecoverFileGDB_management', 'ReExportUnacknowledgedMessages_management', 'RefreshExcel_management', 'RegisterAsVersioned_management', 'RegisterRaster_management', 'RegisterWithGeodatabase_management', 'Remove3DFormats_management', 'RemoveAttachments_management', 'RemoveContingentValue_management', 'RemoveDepthMap_management', 'RemoveDomainFromField_management', 'RemoveFeatureClassFromTopology_management', 'RemoveFieldConflictFilter_management', 'RemoveFilesFromLasDataset_management', 'RemoveIndex_management', 'RemoveJoin_management', 'RemoveRastersFromMosaicDataset_management', 'RemoveRelate_management', 'RemoveRuleFromRelationshipClass_management', 'RemoveRuleFromTopology_management', 'RemoveSpatialIndex_management', 'RemoveSubtype_management', 'Rename_management', 'ReorderAttributeRule_management', 'RepairGeometry_management', 'RepairMosaicDatasetPaths_management', 'RepairTrajectoryDatasetPaths_management', 'RepairVersionMetadata_management', 'RepairVersionTables_management', 'Resample_management', 'Rescale_management', 'Rotate_management', 'SaveToLayerFile_management', 'SaveToolboxToVersion_management', 'SelectLayerByAttribute_management', 'SelectLayerByLocation_management', 
'SetClusterTolerance_management', 'SetDefaultSubtype_management', 'SetFeatureClassSplitModel_management', 'SetMosaicDatasetProperties_management', 'SetRasterProperties_management', 'SetRelationshipClassSplitPolicy_management', 'SetSubtypeField_management', 'SetValueForRangeDomain_management', 'SharePackage_management', 'Shift_management', 'Sort_management', 'SortCodedValueDomain_management', 'SplitLine_management', 'SplitLineAtPoint_management', 'SplitMosaicDatasetItems_management', 'SplitRaster_management', 'StandardizeField_management', 'SubdividePolygon_management', 'SubsetFeatures_management', 'SynchronizeChanges_management', 'SynchronizeMosaicDataset_management', 'TableCompare_management', 'TableToDomain_management', 'TableToEllipse_management', 'TableToRelationshipClass_management', 'TINCompare_management', 'TransferFiles_management', 'TransformField_management', 'TransposeFields_management', 'TrimArchiveHistory_management', 'TruncateTable_management', 'UncompressFileGeodatabaseData_management', 'UnregisterAsVersioned_management', 'UnregisterReplica_management', 'UnsplitLine_management', 'UpdateDataLoadingWorkspace_management', 'UpdateEnterpriseGeodatabaseLicense_management', 'UpdateGeodatabaseConnectionPropertiesToBranch_management', 'UpdateInteriorOrientation_management', 'UpdatePortalDatasetOwner_management', 'UpgradeAttachments_management', 'UpgradeDataset_management', 'UpgradeGDB_management', 'UpgradeSceneLayer_management', 'UploadFileToPortal_management', 'ValidateJoin_management', 'ValidateSceneLayerPackage_management', 'ValidateTopology_management', 'Warp_management', 'WarpFromFile_management', 'WorkspaceToRasterDataset_management', 'XYTableToPoint_management', 'XYToLine_management']
In [32]:
del tools
In [36]:
# Search within a specific toolbox
print(arcpy.ListTools("*_analysis"))  # Get tools only from the Analysis Toolbox
['ApportionPolygon_analysis', 'Buffer_analysis', 'Clip_analysis', 'CountOverlappingFeatures_analysis', 'CreateThiessenPolygons_analysis', 'Enrich_analysis', 'Erase_analysis', 'Frequency_analysis', 'GenerateNearTable_analysis', 'GenerateOriginDestinationLinks_analysis', 'GraphicBuffer_analysis', 'Identity_analysis', 'Intersect_analysis', 'MultipleRingBuffer_analysis', 'Near_analysis', 'PairwiseBuffer_analysis', 'PairwiseClip_analysis', 'PairwiseDissolve_analysis', 'PairwiseErase_analysis', 'PairwiseIntegrate_analysis', 'PairwiseIntersect_analysis', 'PolygonNeighbors_analysis', 'RemoveOverlapMultiple_analysis', 'Select_analysis', 'SpatialJoin_analysis', 'Split_analysis', 'SplitByAttributes_analysis', 'Statistics_analysis', 'SummarizeNearby_analysis', 'SummarizeWithin_analysis', 'SymDiff_analysis', 'TableSelect_analysis', 'TabulateIntersection_analysis', 'Union_analysis', 'Update_analysis']
In [37]:
# Search by specific tool name
print(arcpy.ListTools("Buffer_*")) # Get tools that start with the word Buffer
['Buffer_analysis']
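The outputs above suggest a naming rule: the alias in parentheses after each toolbox name reappears as the suffix of its tool names (for example `Analysis Tools(analysis)` and `Buffer_analysis`). The sketch below illustrates that rule and the wildcard patterns on a few names copied from those outputs. It uses Python's `fnmatchcase` as a stand-in for ArcPy's own wildcard matching, which may differ in details such as case sensitivity; `split_toolbox` and `filter_names` are hypothetical helper names.

```python
import re
from fnmatch import fnmatchcase

# A few names copied from the outputs above
toolboxes = ["Analysis Tools(analysis)", "Data Management Tools(management)", "Spatial Analyst Tools(sa)"]
tools = ["Buffer_analysis", "Clip_analysis", "Clip_management",
         "MakeFeatureLayer_management", "MajorityFilter_sa"]


def split_toolbox(label):
    """Split 'Analysis Tools(analysis)' into ('Analysis Tools', 'analysis')."""
    match = re.fullmatch(r"(.+)\((\w+)\)", label)
    if match is None:
        raise ValueError(f"Unexpected toolbox label: {label!r}")
    return match.group(1).strip(), match.group(2)


def filter_names(names, pattern):
    """Keep the names that match a shell-style wildcard pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


aliases = [split_toolbox(label)[1] for label in toolboxes]
print(aliases)                            # ['analysis', 'management', 'sa']
print(filter_names(tools, "*_analysis"))  # ['Buffer_analysis', 'Clip_analysis']
print(filter_names(tools, "Ma*"))         # ['MakeFeatureLayer_management', 'MajorityFilter_sa']
print(filter_names(tools, "Clip_*"))      # ['Clip_analysis', 'Clip_management']
```

Note that `Clip_*` returns two tools from different toolboxes, which is why the alias suffix matters when a tool name is not unique.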

Creating Geodatabases to manage data in ArcGIS Pro

arcpy.management.CreateFileGDB()

  • arcpy.management.CreateFileGDB(out_folder_path, out_name, {out_version})
    • Geodatabases are the way project files are structured and managed in ArcGIS Pro
    • Collection of geographic datasets held in a common folder
  • Use-case: creating, managing and editing datasets
In [107]:
##========================  Create a Geodatabase to store project data

## TOOL: arcpy.management.CreateFileGDB(out_folder_path, out_name, {out_version})

# folder path
gdb_folder = r'C:\Projects\my_git_pages_website\Py-and-Sky-Labs\content\ArcPY\02 Result'

if not os.path.exists(gdb_folder):
    os.makedirs(gdb_folder)

#params, path
gdb_name = 'MN_EnvironmentalData.gdb'
gdb_path = os.path.join(gdb_folder,gdb_name)

# Create gdb if it does not exist
if not arcpy.Exists(gdb_path):
    arcpy.management.CreateFileGDB(gdb_folder,gdb_name)
else: 
    print("GDB already exists", gdb_path)
GDB already exists C:\Projects\my_git_pages_website\Py-and-Sky-Labs\content\ArcPY\02 Result\MN_EnvironmentalData.gdb
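Because the notebook is re-run often, the create-if-missing logic can be wrapped in a small function. This is a sketch under assumptions: `gdb_path_for` and `ensure_file_gdb` are hypothetical names, and only `arcpy.Exists` and `arcpy.management.CreateFileGDB` (both used in the cell above) are called. ArcPy is imported inside the function so the path helper can be tried without ArcGIS Pro.

```python
import os


def gdb_path_for(folder, name):
    """Join folder and geodatabase name, appending '.gdb' if it is missing."""
    if not name.lower().endswith(".gdb"):
        name = name + ".gdb"
    return os.path.join(folder, name)


def ensure_file_gdb(folder, name):
    """Create the file geodatabase only if it does not exist yet; return its path."""
    import arcpy  # imported here so gdb_path_for() works without ArcGIS Pro

    gdb_path = gdb_path_for(folder, name)
    os.makedirs(folder, exist_ok=True)  # make sure the parent folder exists
    if not arcpy.Exists(gdb_path):
        arcpy.management.CreateFileGDB(folder, os.path.basename(gdb_path))
    return gdb_path


# The path helper alone needs no ArcGIS installation
print(gdb_path_for("02 Result", "MN_EnvironmentalData"))
```

Inside ArcGIS Pro, `gdb_path = ensure_file_gdb(gdb_folder, gdb_name)` would replace the if/else block above.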

Define Coordinate System by creating a spatial reference object

arcpy.SpatialReference()

  • Use-case: changing or defining the CS for a new dataset
  • multiple ways you can specify a coordinate system

  • Note the general pattern: library.class.attribute

    • accessing attributes of class gets you more info
    • sr = arcpy.SpatialReference(4326)
    • sr.name
In [61]:
##========================  Create a Spatial Reference object to define Coordinate Systems

### TOOL: arcpy.SpatialReference(item, {vcs}, {text})


# Input Data USGS Minnesota Wells : https://nwis.waterdata.usgs.gov/nwis/gwlevels?search_criteria=state_cd&submitted_form=introduction
# Metadata Shows CS is NAD83

wkid = 4269 # GCS_North_American_1983

sr= arcpy.SpatialReference(wkid) # Can pass in a Geographic or Projected CS (e.g., could have passed in wkid 26915 for PCS NAD83 UTM Zone 15N)

print("\nName of CS: ",sr.name, " \nType of CS: ", sr.type, "\nDatum name: " , sr.datumName, "\nUnits are : ", sr.angularUnitName)
Name of CS:  GCS_North_American_1983  
Type of CS:  Geographic 
Datum name:  D_North_American_1983 
Units are :  Degree
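Which unit property is meaningful depends on the type of coordinate system: the cell above reads `angularUnitName` for a geographic system, while the projection step later in this notebook reads `linearUnitName` for a projected one. The sketch below wraps that choice in a hypothetical helper, `sr_summary`. To keep it runnable without ArcGIS Pro it is demonstrated on stand-in objects filled with the values this notebook reports for WKID 4269 and 26915; the unused unit attribute is left empty, which may not match what arcpy returns.

```python
from types import SimpleNamespace


def sr_summary(sr):
    """Summarise a spatial reference object using its name, type, and unit.

    Geographic systems report an angular unit (e.g., Degree);
    projected systems report a linear unit (e.g., Meter).
    """
    if sr.type == "Geographic":
        unit = sr.angularUnitName
    else:
        unit = sr.linearUnitName
    return f"{sr.name} ({sr.type}, units: {unit})"


# Stand-ins for arcpy.SpatialReference(4269) and arcpy.SpatialReference(26915)
nad83 = SimpleNamespace(name="GCS_North_American_1983", type="Geographic",
                        angularUnitName="Degree", linearUnitName="")
utm15n = SimpleNamespace(name="NAD_1983_UTM_Zone_15N", type="Projected",
                         angularUnitName="", linearUnitName="Meter")

print(sr_summary(nad83))   # GCS_North_American_1983 (Geographic, units: Degree)
print(sr_summary(utm15n))  # NAD_1983_UTM_Zone_15N (Projected, units: Meter)
```

With ArcGIS Pro available, the call would simply be `sr_summary(arcpy.SpatialReference(4269))`.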

Using Geoprocessing Tools in Python

  • Geoprocessing tools are called using this general structure: arcpy.module.ToolName
    • module is Python's way of accessing a toolbox as seen in ArcGIS
  • arcpy.management.XYTableToPoint
    • arcpy.management.XYTableToPoint(in_table, out_feature_class, x_field, y_field, {z_field}, {coordinate_system})
    • Optional parameters have curly braces { }
    • Skip an optional parameter by passing an empty string "" in its position, or name only the optional parameters you need as keyword arguments (e.g., coordinate_system=sr)

```python
# Optional parameter specified by keyword
arcpy.management.XYTableToPoint(intable, outfeature, x_field, y_field, coordinate_system=arcpy.SpatialReference(4326))
```

  • Note: 4326 is the well-known ID (WKID) for WGS 1984
In [108]:
##========================  Load in example wells csv file data and check its datatypes

# note: to identify x, y fields and CS, open the file or load into python (pandas or similar) 
import pandas as pd 
input_data = r'C:\Projects\my_git_pages_website\Py-and-Sky-Labs\content\ArcPY\01 Data\MN_Water_wells_rft.csv'

# Read the CSV using a comma as the delimiter
df = pd.read_csv(input_data, delimiter=",", encoding="utf-8") # ensure the file is comma-delimited; arcpy needs comma-delimited input here
print(df.dtypes)
print()
print(df.columns)
agency_cd              object
site_no                 int64
station_nm             object
site_tp_cd             object
dec_lat_va            float64
dec_long_va           float64
coord_acy_cd           object
dec_coord_datum_cd     object
county_cd               int64
map_nm                 object
well_depth_va         float64
dtype: object

Index(['agency_cd', 'site_no', 'station_nm', 'site_tp_cd', 'dec_lat_va',
       'dec_long_va', 'coord_acy_cd', 'dec_coord_datum_cd', 'county_cd',
       'map_nm', 'well_depth_va'],
      dtype='object')
In [109]:
##========================   Reformat datatypes to match the expected format


# Convert 'int64' fields to 'float64' (ArcPy does not support int64)
df["site_no"] = df["site_no"].astype("float64")  # Converts int64 to float64
df["county_cd"] = df["county_cd"].astype("float64")

# save to csv with fixed formatting
df.to_csv(r"01 Data\MN_Water_wells_fixed.csv", index=False,  sep=',', encoding="utf-8")
In [114]:
##========================  Change the working directory to the GDB

# set the folder path as the working environment, and store it for later use
try:
    arcpy.env.workspace = gdb_path
    wksp = arcpy.env.workspace ## optionally, update the path in wksp variable
except Exception as e:
    print(f"Error setting up the path, {e}, Exception type: {type(e).__name__}") # python's built-in error messaging, {e} prints description and type(e).__name__ category

print("Working environment is here",wksp)

##========================  Convert spreadsheet data to a point feature class based on x,y coordinates from a table

### TOOL: arcpy.management.XYTableToPoint(in_table, out_feature_class, x_field, y_field, {z_field}, {coordinate_system})

# params
input_data = r'C:\Projects\my_git_pages_website\Py-and-Sky-Labs\content\ArcPY\01 Data\MN_Water_wells_fixed.csv'
output_data= "MN_Wells"
y_coords =  'dec_lat_va'
x_coords =  'dec_long_va' # ensure longitude is x


# try/except to handle errors

try:                                                                                # argument (sr) passed in with the key word (coordinate_system)
    arcpy.management.XYTableToPoint(input_data, output_data, x_coords, y_coords, coordinate_system=sr)
except Exception as e:
    print("Error converting table" ,e)
Working environment is here C:\Projects\my_git_pages_website\Py-and-Sky-Labs\content\ArcPY\02 Result\MN_EnvironmentalData.gdb
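Swapping the x and y fields is an easy mistake, and the tool will still produce points, just in the wrong place. A quick range check before the conversion can catch it. The sketch below uses only the standard library; `check_xy_fields` is a hypothetical helper and the sample rows are made up, although the column names come from the wells CSV. It applies to coordinates in degrees only, and it detects a swap only when longitudes lie beyond ±90, as they do for Minnesota.

```python
import csv
import io


def check_xy_fields(rows, x_field, y_field):
    """Return (row_number, message) pairs for coordinates outside valid degree ranges."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        try:
            x = float(row[x_field])
            y = float(row[y_field])
        except (KeyError, TypeError, ValueError) as e:
            problems.append((row_number, f"unreadable coordinate: {type(e).__name__}"))
            continue
        if not -180 <= x <= 180:
            problems.append((row_number, f"x={x} is not a valid longitude"))
        if not -90 <= y <= 90:
            problems.append((row_number, f"y={y} is not a valid latitude"))
    return problems


# Made-up sample rows using the column names from the wells CSV
sample_csv = """site_no,dec_lat_va,dec_long_va
1,44.98,-93.27
2,46.79,-92.10
"""
rows = list(csv.DictReader(io.StringIO(sample_csv)))

correct = check_xy_fields(rows, x_field="dec_long_va", y_field="dec_lat_va")
swapped = check_xy_fields(rows, x_field="dec_lat_va", y_field="dec_long_va")
print(correct)  # [] because longitude is x and latitude is y
print(swapped)  # both rows flagged: a y value near -93 cannot be a latitude
```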
In [120]:
## Check if the new feature exists in the gdb

fc_list = arcpy.ListFeatureClasses() # lists all feature classes in the current workspace (i.e., our gdb); can filter by feature_type, e.g., "Point"


for fc in fc_list:
    print(fc)
MN_Wells

Define Spatial Reference

arcpy.management.DefineProjection(input_fc, arcpy.SpatialReference())

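  • Use-case: Assign a spatial reference to a dataset whose coordinate system is missing or recorded incorrectly; this only updates the dataset's coordinate system definition and does not change any coordinate values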
In [ ]:
input_fc= "MN_Wells"
wkid= 4269  # GCS_North_American_1983
sr = arcpy.SpatialReference(wkid)
out_cs = sr
arcpy.management.DefineProjection(input_fc, out_cs)
In [122]:
# Access properties for the feature class

fc_desc = arcpy.Describe(fc)

fc_desc.spatialReference
Out[122]:
name (Geographic Coordinate System)GCS_North_American_1983
factoryCode (WKID)4269
angularUnitName (Angular Unit)Degree
datumName (Datum)D_North_American_1983
In [155]:
fc_desc.shapeType
Out[155]:
'Point'

Batch Creation of Multiple Tables to Point features

  • use-case: multiple new data inputs that need to be converted to points

Steps:

  1. Check Coordinate Systems match
  2. Project to common CS, if needed
  3. Check x,y (lat/lon) columns, re-format
  4. Convert tables to points

Data Sources :

  • Geography: MN

| Input Name | Coordinate System | Source |
|---|---|---|
| Water Well locations | GCS: NAD 83 | CSV output |
| What's in my neighborhood sites | PCS: NAD 83 UTM Zone 15N | MetaData |
| MN pollution control activity sites | PCS: NAD 83 UTM Zone 15N | MetaData |
| PFAS levels | GCS: NAD 83 UTM Zone 15N | Report |
| Ground Water contamination sites | PCS: NAD 83 UTM Zone 15N | MetaData |

  • All are CSV file type
  • Two of the input datasets are in a GCS rather than the PCS shared by the others; we will demonstrate how to project GCS data to that same PCS
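Steps 1 and 2 amount to comparing each input's coordinate system with the target and projecting only the ones that differ. A minimal sketch of that comparison follows. The function name `plan_projection` and the inventory are hypothetical: `MN_Wells` with WKID 4269 comes from this notebook, while `example_sites_utm` is a made-up entry standing in for any dataset already in NAD 83 UTM Zone 15N (WKID 26915).

```python
TARGET_WKID = 26915  # NAD 1983 UTM Zone 15N, the common PCS in this notebook


def plan_projection(dataset_wkids, target_wkid=TARGET_WKID):
    """Split dataset names into those already in the target CS and those to project."""
    matching, to_project = [], []
    for name, wkid in dataset_wkids.items():
        if wkid == target_wkid:
            matching.append(name)
        else:
            to_project.append(name)
    return matching, to_project


# Hypothetical inventory of dataset name -> WKID
inventory = {
    "MN_Wells": 4269,            # GCS NAD 83, as defined earlier
    "example_sites_utm": 26915,  # made-up name for a dataset already in the target PCS
}

matching, to_project = plan_projection(inventory)
print("Already in target CS:", matching)
print("Need arcpy.management.Project:", to_project)
```

In practice the WKID of an existing feature class could be read from `arcpy.Describe(fc).spatialReference.factoryCode` rather than typed by hand.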
In [161]:
###====================  Project spatial data from one coordinate system to another


## TOOL: arcpy.management.Project(in_dataset, out_dataset, out_coor_system, {transform_method}, {in_coor_system}, {preserve_shape}, {max_deviation}, {vertical})

# params
input_data = "MN_Wells"
out_proj= 'MN_Wells_repojected'
wkid_proj = 26915
out_cs = arcpy.SpatialReference(wkid_proj)

print("\nName of CS: ",out_cs.name, " \nType of CS: ", out_cs.type , "\nUnits are : ", out_cs.linearUnitName)

try:

    arcpy.management.Project(input_data, out_proj,out_cs)
    
except Exception as e:
    print("Error projecting the data", e)
Name of CS:  NAD_1983_UTM_Zone_15N  
Type of CS:  Projected 
Units are :  Meter
In [ ]:
# check that new projected fc exists in the workspace
arcpy.ListFeatureClasses()
Out[ ]:
['MN_Wells', 'MN_Wells_repojected']
In [231]:
##=====================  Select and create a list of the tables to be processed

input_folder_path = r'C:\Users\beste\Desktop\ArcPy\what_is_arcpy\01 Data'

new_file_list = set()

# use Walk to list out the names of the files in the folder
for dirpath, dirnames, filenames in arcpy.da.Walk(input_folder_path): # yields a tuple of (folder path, folder names, file names) for each folder visited
    print("Walking through this folder", dirpath)
    for file in filenames:
        if not file.startswith("MN_Water"):
            print(file)
            full_path = os.path.join(dirpath, file) # join with dirpath so files found in subfolders get the correct path
            new_file_list.add(full_path)



print("Compiled input files", new_file_list)
Walking through this folder C:\Users\beste\Desktop\ArcPy\what_is_arcpy\01 Data
MN_GroundWater_ContaminationSites.csv
MN_PFAS_levels.csv
mpca_agency_interests.csv
my_neighborhood_sites.csv
Walking through this folder C:\Users\beste\Desktop\ArcPy\what_is_arcpy\01 Data\metadata
Compiled input files {'C:\\Users\\beste\\Desktop\\ArcPy\\what_is_arcpy\\01 Data\\MN_PFAS_levels.csv', 'C:\\Users\\beste\\Desktop\\ArcPy\\what_is_arcpy\\01 Data\\MN_GroundWater_ContaminationSites.csv', 'C:\\Users\\beste\\Desktop\\ArcPy\\what_is_arcpy\\01 Data\\my_neighborhood_sites.csv', 'C:\\Users\\beste\\Desktop\\ArcPy\\what_is_arcpy\\01 Data\\mpca_agency_interests.csv'}
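One detail of walking a folder tree is worth isolating: each file name must be joined with the folder currently being visited (`dirpath`), because joining with the top-level folder produces a path that does not exist for anything inside a subfolder. The sketch below shows this with `os.walk` from the standard library as a stand-in for `arcpy.da.Walk`, on a throwaway folder with made-up file names; `collect_csv_paths` is a hypothetical helper.

```python
import os
import tempfile


def collect_csv_paths(root, skip_prefix="MN_Water"):
    """Walk `root` and return full paths of CSV files not starting with skip_prefix."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.lower().endswith(".csv") and not filename.startswith(skip_prefix):
                # Join with dirpath (the folder being visited), not with root
                found.add(os.path.join(dirpath, filename))
    return found


# Build a throwaway folder tree with made-up file names
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "metadata"))
    for relative in ("MN_Water_wells.csv", "sites.csv", os.path.join("metadata", "notes.csv")):
        with open(os.path.join(root, relative), "w") as f:
            f.write("latitude,longitude\n")

    paths = collect_csv_paths(root)
    relative_paths = sorted(os.path.relpath(p, root) for p in paths)
    all_exist = all(os.path.exists(p) for p in paths)
    print(relative_paths)
    print(all_exist)  # True: every collected path points at a real file
```

The file inside `metadata` is found with its correct path, and the file whose name starts with `MN_Water` is skipped, mirroring the filter in the cell above.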
In [232]:
# check that the x,y column names match across files; reformat them if not
import pandas as pd # needed here if pandas was not already imported earlier in the notebook
for file in new_file_list:
    df= pd.read_csv(file)
    df.columns = df.columns.str.lower().str.strip() # Lat/Long casing initially differed between files: lowercase all names and trim extra spaces
    print("File path: ", file,"\n" ,df.columns)
    df.to_csv(file, index= False) # overwrites the source file; index=False prevents an extra index column being added
File path:  C:\Users\beste\Desktop\ArcPy\what_is_arcpy\01 Data\MN_PFAS_levels.csv 
 Index(['group', 'program', 'preferred id', 'facility name', 'sys loc code',
       'loc name', 'loc type', 'latitude', 'longitude', 'sys sample code',
       'sample date', 'medium', 'matrix', 'analyte', 'analyte short name',
       'cas', 'detect flag', 'result text', 'result numeric', 'unit',
       'lab qualifiers', 'detect description', 'method detection limit',
       'reporting detection limit', 'approval code', 'fraction',
       'dilution factor', 'result type code', 'analytic method', 'ai id',
       'ai name', 'address', 'city', 'state', 'zip code', 'county'],
      dtype='object')
File path:  C:\Users\beste\Desktop\ArcPy\what_is_arcpy\01 Data\MN_GroundWater_ContaminationSites.csv 
 Index(['id', 'name', 'preferredid', 'description', 'county', 'municipality',
       'address', 'zip', 'state', 'longitude', 'latitude', 'acreage',
       'typedescription'],
      dtype='object')
File path:  C:\Users\beste\Desktop\ArcPy\what_is_arcpy\01 Data\my_neighborhood_sites.csv 
 Index(['objectid', 'site_id', 'name', 'active_flag', 'address_street',
       'address_city', 'address_state', 'address_zip', 'city', 'county',
       'cong_district', 'senate_district', 'house_district', 'huc8',
       'watershed_name', 'site_url', 'activity', 'activity_list', 'mpca_id',
       'mpca_id_list', 'program_code_list', 'program_name',
       'program_name_list', 'industrial_classification', 'ic_flag', 'latitude',
       'longitude', 'coord_collect_method_code', 'coord_collect_method_name',
       'shape', 'gdb_geomattr_data'],
      dtype='object')
C:\Users\beste\AppData\Local\Temp\ipykernel_33536\470595180.py:4: DtypeWarning: Columns (9,10) have mixed types. Specify dtype option on import or set low_memory=False.
  df= pd.read_csv(file)
File path:  C:\Users\beste\Desktop\ArcPy\what_is_arcpy\01 Data\mpca_agency_interests.csv 
 Index(['objectid', 'item_id', 'ai_id', 'int_doc_id', 'si_type', 'si_cat',
       'si_id', 'si_cat_desc', 'si_type_desc', 'description', 'si_designation',
       'name', 'program', 'program_list', 'address', 'city', 'state', 'zip',
       'county_code', 'county', 'ctu_code', 'ctu_name', 'cong_dist',
       'house_dist', 'senate_dist', 'huc8', 'huc8_name', 'huc10', 'huc10_name',
       'huc12', 'huc12_name', 'dwsma_code', 'dwsma_name', 'loc_desc', 'trdsqq',
       'pls_twsp', 'range', 'range_dir', 'section', 'quarters', 'latitude',
       'longitude', 'method_code', 'method_desc', 'ref_code', 'ref_desc',
       'verified', 'collection_date', 'tmsp_created', 'user_created',
       'tmsp_updt', 'user_updt', 'status', 'status_date', 'spatial_id',
       'shape'],
      dtype='object')
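Before converting, it can help to confirm that each table really has the expected coordinate columns once the names are normalized. A minimal standard-library sketch follows. `normalize_header` and `has_xy_columns` are hypothetical helpers, not part of pandas or arcpy, and the sample header is invented for illustration.

```python
import csv
import io

def normalize_header(columns):
    """Lowercase and strip each column name, mirroring the pandas step above."""
    return [c.strip().lower() for c in columns]

def has_xy_columns(columns, x_field="longitude", y_field="latitude"):
    """Return True when both coordinate fields exist after normalization."""
    normalized = set(normalize_header(columns))
    return x_field in normalized and y_field in normalized

# Invented sample with inconsistent casing and stray spaces
sample = io.StringIO(" Latitude ,LONGITUDE,Name\n44.95,-93.09,Site A\n")
header = next(csv.reader(sample))

print(normalize_header(header))
print(has_xy_columns(header))
```

A table that fails this check could be skipped or reported before it reaches the point conversion step.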
In [233]:
##========================  Convert multiple spreadsheets to point feature classes based on x,y coordinates

# set the workspace where the points will be created
arcpy.env.workspace = gdb_path
arcpy.env.overwriteOutput = True # allow existing outputs with the same name to be replaced
print("Gdb path set to: ", gdb_path)

# params
y_coords =  'latitude'
x_coords =  'longitude' # ensure longitude is x


##=======================    Loop through files one-by-one and convert them to points
counter = 0
for file in new_file_list:
    input_data = file
    output_name = os.path.splitext(os.path.basename(file))[0] # take the file name from the path and drop its extension
    print(output_name)
    output_data = os.path.join(gdb_path, output_name) # full path of the output feature class inside the geodatabase
    # try/except so one failed table does not stop the loop
    try:                                                                                # argument (sr) passed in with the keyword (coordinate_system)
        arcpy.management.XYTableToPoint(input_data, output_data, x_coords, y_coords, coordinate_system=sr)
        print(f"Successfully Converted {output_name} to points")
        counter += 1
    except Exception as e:
        print(f"Error converting table {output_name}:", e)


print(f"Finished Converting {counter} tables to points")
Gdb path set to:  C:\Projects\my_git_pages_website\Py-and-Sky-Labs\content\ArcPY\02 Result\MN_EnvironmentalData.gdb
MN_PFAS_levels
Successfully Converted MN_PFAS_levels to points
MN_GroundWater_ContaminationSites
Successfully Converted MN_GroundWater_ContaminationSites to points
my_neighborhood_sites
Successfully Converted my_neighborhood_sites to points
mpca_agency_interests
Successfully Converted mpca_agency_interests to points
Finished Converting 4 tables to points
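The output names above come straight from the CSV file names, which works here because they contain only letters, digits, and underscores. For file names with spaces, hyphens, or a leading digit, a cleanup step may be needed before they can serve as feature class names. The sketch below is a rough approximation only: `output_name_from_path` is a hypothetical helper, the `t_` prefix is an arbitrary choice, and the paths are invented. Within arcpy itself, `arcpy.ValidateTableName` is the function intended for this job.

```python
import re
from pathlib import PureWindowsPath

def output_name_from_path(path):
    """Derive a simple feature class name from a Windows-style file path.

    Hypothetical helper: replaces anything other than letters, digits,
    and underscores with "_" and prefixes names that start with a digit.
    """
    stem = PureWindowsPath(path).stem  # file name without folder or extension
    name = re.sub(r"[^0-9A-Za-z_]", "_", stem)
    if name and name[0].isdigit():
        name = "t_" + name
    return name

# Invented paths for illustration
print(output_name_from_path(r"C:\data\01 Data\MN_PFAS_levels.csv"))  # MN_PFAS_levels
print(output_name_from_path(r"C:\data\2024 sites-v2.csv"))           # t_2024_sites_v2
```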
In [234]:
# check the features that exist in the workspace
arcpy.ListFeatureClasses()
Out[234]:
['MN_Wells',
 'MN_Wells_repojected',
 'MN_PFAS_levels',
 'MN_GroundWater_ContaminationSites',
 'my_neighborhood_sites',
 'mpca_agency_interests']
