Introduction to List Comprehensions¶

Definition
- List comprehensions are a concise and efficient way to create lists in Python
- They provide a syntactically elegant method to perform operations and apply conditions to iterables, allowing the creation or transformation of lists in a single line of code

Why Do We Use List Comprehensions?:

Conciseness: Reduces the amount of code needed compared to traditional loops, making the code cleaner and easier to read
Performance: Often faster than an equivalent for loop that calls append, because the comprehension avoids looking up and calling list.append on every iteration
Expressiveness: Allows the code to be more descriptive and focused on the operation itself, rather than the mechanics of looping and appending to lists
Versatility: Capable of incorporating conditional logic within the list creation, which lets you filter elements or apply complex transformations easily
Key point: Use list comprehensions to transform or extract data

Syntax:

Basic Structure: A list comprehension consists of brackets containing an expression followed by a for clause
Optionally, that first for clause can be followed by additional for or if clauses.

Generalized Examples:¶

[expression for item in iterable]
[expression for item in iterable if condition]
[expression for item in iterable if condition1 if condition2]
[expression for item1 in iterable1 for item2 in iterable2]
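
The last two generalized forms are not demonstrated elsewhere in this notebook, so here is a minimal sketch of each. The variable names (`div_by_6`, `pairs`) are illustrative only.

```python
# Two `if` clauses behave like one condition joined with `and`.
div_by_6 = [n for n in range(20) if n % 2 == 0 if n % 3 == 0]
print(div_by_6)  # [0, 6, 12, 18]

# Two `for` clauses behave like nested loops; the leftmost `for` is the outer loop.
pairs = [(letter, number) for letter in "ab" for number in range(3)]
print(pairs)
# [('a', 0), ('a', 1), ('a', 2), ('b', 0), ('b', 1), ('b', 2)]
```

Writing `if n % 2 == 0 and n % 3 == 0` is equivalent to the chained form and is often easier to read.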

In [60]:

import pandas as pd
import os

Basic List Comprehension¶

In [61]:

# basic list comprehension for squaring numbers and creating a list

# Basic components of list comprehension
# (1) expression
# (2) item
# (3) iterable
#         (1)      (2)   (3)
squares = [x**2 for x in range(10)]

print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [62]:

# What happened?
# The list comprehension iterated over the numbers 0 through 9, squared each one, and collected the results in a new list
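
For comparison, the same result can be built with an explicit loop. This is a sketch of the equivalent long form; `squares_loop` and `squares_comp` are names chosen for illustration.

```python
# Long form: create an empty list, then append inside a for loop.
squares_loop = []
for x in range(10):
    squares_loop.append(x**2)

# Short form: the list comprehension does the same work in one expression.
squares_comp = [x**2 for x in range(10)]

print(squares_loop == squares_comp)  # True
```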

Conditional List Comprehension¶

In [63]:

# Generate the squares of the even numbers from 1 to 20
# (range(1, 21) stops before 21) using a list comprehension with a condition

# Basic components of list comprehension
# (1) expression
# (2) item
# (3) iterable
# (4) condition

#         (1)    (2)    (3)       (4)
evens = [x**2 for x in range(1,21) if x %2==0]

print(evens)

[4, 16, 36, 64, 100, 144, 196, 256, 324, 400]

In [64]:

# List comprehension that filters results based on membership in predefined list
magic_nums = [1,2,7,8]

mylist_magnum = [x**2 for x in range(10) if x in magic_nums]

print(magic_nums)

print(mylist_magnum)

[1, 2, 7, 8]
[1, 4, 49, 64]

In [65]:

# Create a list of tuples with numbers and their squares

numbers_and_squares = [(x,x**2) for x in range(10)]

print(numbers_and_squares)

numbers_and_squares.append((10,100))

print(numbers_and_squares)

# use the pop() method to remove and store an item

# remove item at index 0 and store it in a variable
one_item =numbers_and_squares.pop(0)
print(one_item)
print(numbers_and_squares)

[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64), (9, 81)]
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64), (9, 81), (10, 100)]
(0, 0)
[(1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64), (9, 81), (10, 100)]

In [66]:

# Create a list of tuples with numbers and their squares

numbers_and_squares = [(x,x**2) for x in range(10)]

print(numbers_and_squares)

numbers_and_squares.append((10,100))

print(numbers_and_squares)

# use the del statement to remove items without storing them
# del also accepts a slice to delete a range of items

# remove item at index 0 without storing it
del numbers_and_squares[0]

print(numbers_and_squares)

[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64), (9, 81)]
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64), (9, 81), (10, 100)]
[(1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64), (9, 81), (10, 100)]

In [67]:

# Create a list of tuples with numbers and their squares

numbers_and_squares = [(x,x**2) for x in range(10)]

print(numbers_and_squares)

numbers_and_squares.append((10,100))

print(numbers_and_squares)

# Remove items at indices 0 to 4 (inclusive of 0, exclusive of 5)
del numbers_and_squares[0:5]

print(numbers_and_squares)  # Output: [(5, 25), (6, 36), (7, 49), (8, 64), (9, 81), (10, 100)]

[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64), (9, 81)]
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64), (9, 81), (10, 100)]
[(5, 25), (6, 36), (7, 49), (8, 64), (9, 81), (10, 100)]

Extracting data using list comprehensions¶

In [68]:

# Extract values that meet a certain criteria

# Basic components of list comprehension
# (1) expression
# (2) item
# (3) iterable
# (4) condition

test_scores= [50,60,65,98,91,85,100]

#                 (1)   (2)   (3)            (4)
passing_grades = [x for x in test_scores if x >60]

print(passing_grades)

[65, 98, 91, 85, 100]

In [69]:

#### Extract a single item
# Generalized form for extracting one field from each element that meets a criterion:
# list_data = [element['key1'] for element in elements if element['key2'] > x]

# List comprehension to extract names from a list of dictionaries
# each dictionary is a person with a name and age
people_data = [
    {'name': 'John', 'age': 28},
    {'name': 'Anna', 'age': 20},
    {'name': 'James', 'age': 18},
    {'name': 'Linda', 'age': 30}
]

adults_info = [person['name'] for person in people_data if person['age']>21]

print(adults_info)

['John', 'Linda']

In [70]:

del adults_info[0]
print(adults_info)

['Linda']

In [71]:

# List comprehension to extract tuple of names and age from a list of dictionaries
# each dictionary is a person with a name and age
people_data = [
    {'name': 'John', 'age': 28},
    {'name': 'Anna', 'age': 20},
    {'name': 'James', 'age': 18},
    {'name': 'Linda', 'age': 30}
]

adults_info = [(person['name'], person['age']) for person in people_data if person['age']>21]

print(adults_info)

[('John', 28), ('Linda', 30)]

In [72]:

del adults_info[0]
print(adults_info)

[('Linda', 30)]

In [73]:

# List comprehension to extract tuple of names and age from a list of dictionaries
# each dictionary is a person with a name and age
people_data = [
    {'name': 'John', 'age': 28},
    {'name': 'Anna', 'age': 20},
    {'name': 'James', 'age': 18},
    {'name': 'Linda', 'age': 30}
]

adults_info = [(person['name'], person['age']) for person in people_data if person['age']>21]

print(adults_info)

[('John', 28), ('Linda', 30)]

In [74]:

first_adult=adults_info.pop(0)

print(first_adult)

print(adults_info)

adults_info.append(first_adult)

print(adults_info)

('John', 28)
[('Linda', 30)]
[('Linda', 30), ('John', 28)]

Understanding DataFrame Iteration with `iterrows()`¶

1) Introduction to DataFrame Iteration

Why Iteration? Iteration over DataFrames is commonly needed when each row of data must be processed individually.
While vectorized operations are preferred for performance, iteration is useful for complex operations that aren't easily vectorized or when debugging row by row.

2) Using iterrows():

Definition: iterrows() is a generator that iterates over the rows of a DataFrame.
It yields one (index, row) tuple per row, where row is a Series holding that row's data.

Syntax:

index: Represents the index of the row in the DataFrame
row: A Series containing the row data
iterrows(): a generator that iterates over the rows
row['column_name']: Accesses data in a specific column for that row

Example:¶

```
for index, row in df.iterrows():
    row['column_name']  # value of 'column_name' in the current row
```

Data cleaning with `iterrows()`¶

In [75]:

# iterating over rows with iterrows():


# Create a sample dataframe
data = {'Name': ['John', 'Anna', 'James'], 'Age': ['28',22,'35a']}
sdf= pd.DataFrame(data)

# iterate over the dataframe
# for each (index, row) pair, read the row's values and write cleaned values back with .at
for index,row in sdf.iterrows():
    # strip leading/trailing white space from the name
    sdf.at[index, 'Name'] = row['Name'].strip()
    # Check if the row 'Age' is a string
    if isinstance(row['Age'], str):
        print(f"String data index is {index}, Name is {row['Name']}, Age is {row['Age']}")
        # isdigit() is True only when every character is a digit; letters or special characters make it False
        if row['Age'].isdigit():
            sdf.at[index,'Age'] = int(row['Age']) # all digits, so convert to an integer
            print(f"Cleaned data index is {index}, Name is {row['Name']}, Age is {row['Age']}")
        else:
            sdf.at[index, 'Age'] = pd.NA
            print(f"Uncleaned data index is {index}, Name is {row['Name']}, Age is {row['Age']}")

String data index is 0, Name is John, Age is 28
Cleaned data index is 0, Name is John, Age is 28
String data index is 2, Name is James, Age is 35a
Uncleaned data index is 2, Name is James, Age is <NA>
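
Since vectorized operations are preferred where they fit, here is a sketch of a loop-free alternative for the same clean-up. It swaps in `pd.to_numeric` with `errors='coerce'`, which is a different technique from the `isdigit()` check and is not identical to it: `to_numeric` also accepts strings such as `'3.5'` or `'-4'`, and the resulting column is floating point because the unparseable entry becomes NaN.

```python
import pandas as pd

sdf = pd.DataFrame({'Name': ['John', 'Anna', 'James'], 'Age': ['28', 22, '35a']})

# Strip leading/trailing whitespace from every name in one call
sdf['Name'] = sdf['Name'].str.strip()

# errors='coerce' turns any value that cannot be parsed as a number into NaN
sdf['Age'] = pd.to_numeric(sdf['Age'], errors='coerce')

print(sdf)
```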

In [76]:

import pandas as pd

# Create a sample dataframe
data = {
    'UserName': ['JohnDoe', 'AnnaSmith', 'JamesBond', 'JohnDoe', 'AnnaSmith'],
    'UniqueID': [101, 102, 103, 101, 102],
    'State': ['NY', 'CA', 'TX', 'NY', 'CA']
}
df = pd.DataFrame(data)


# initiate an empty set that will only hold unique IDs
unique_userId= set()

# iterate over the dataframe
for index, row in df.iterrows():

    # Check if the current row's UniqueID is already in the set
    if row['UniqueID'] in unique_userId:
        
        # Mark this row as a duplicate (creates the 'Duplicates' column on first use)
        df.at[index, 'Duplicates'] = True

    else: 
        # Add the current row unique ID to the set
        unique_userId.add(row['UniqueID'])
        # Mark the row as False for Duplicates
        df.at[index, 'Duplicates']= False

print(df)

    UserName  UniqueID State Duplicates
0    JohnDoe       101    NY      False
1  AnnaSmith       102    CA      False
2  JamesBond       103    TX      False
3    JohnDoe       101    NY       True
4  AnnaSmith       102    CA       True

In [77]:

# remove duplicate

df= df[df['Duplicates']==False]

print(df)

    UserName  UniqueID State Duplicates
0    JohnDoe       101    NY      False
1  AnnaSmith       102    CA      False
2  JamesBond       103    TX      False

In [78]:

df=df.drop(columns='Duplicates')

print(df)

    UserName  UniqueID State
0    JohnDoe       101    NY
1  AnnaSmith       102    CA
2  JamesBond       103    TX

In [79]:

#duplicated

# Create a sample dataframe
data = {
    'UserName': ['JohnDoe', 'AnnaSmith', 'JamesBond', 'JohnDoe', 'AnnaSmith'],
    'UniqueID': [101, 102, 103, 101, 102],
    'State': ['NY', 'CA', 'TX', 'NY', 'CA']
}
df = pd.DataFrame(data)

# the duplicated() method identifies duplicate rows
# subset parameter: specifies the column(s) to check for duplicates
# keep='first': leaves the first occurrence unmarked and marks subsequent duplicates as True

df['Duplicates']= df.duplicated(subset='UniqueID', keep='first')
print(df)

    UserName  UniqueID State  Duplicates
0    JohnDoe       101    NY       False
1  AnnaSmith       102    CA       False
2  JamesBond       103    TX       False
3    JohnDoe       101    NY        True
4  AnnaSmith       102    CA        True

In [80]:

# keep only the rows that are not marked as duplicates
df = df[~df['Duplicates']]
print(df)

    UserName  UniqueID State  Duplicates
0    JohnDoe       101    NY       False
1  AnnaSmith       102    CA       False
2  JamesBond       103    TX       False

In [81]:

df=df.drop(columns='Duplicates')

print(df)

    UserName  UniqueID State
0    JohnDoe       101    NY
1  AnnaSmith       102    CA
2  JamesBond       103    TX
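
The mark, filter, and drop-column steps can also be collapsed into a single call. This is a sketch using `DataFrame.drop_duplicates` on the same sample data; `deduped` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'UserName': ['JohnDoe', 'AnnaSmith', 'JamesBond', 'JohnDoe', 'AnnaSmith'],
    'UniqueID': [101, 102, 103, 101, 102],
    'State': ['NY', 'CA', 'TX', 'NY', 'CA'],
})

# Keep the first row for each UniqueID and drop later repeats
deduped = df.drop_duplicates(subset='UniqueID', keep='first')

print(deduped)
```

Marking duplicates first, as in the cells above, is still useful when you want to inspect the flagged rows before removing them.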

In [82]:

#duplicated

# Create a sample dataframe
data = {
    'UserName': ['JohnDoe', 'AnnaSmith', 'JamesBond', 'JohnDoe', 'AnnaSmith'],
    'UniqueID': [101, 102, 103, 101, 102],
    'State': ['NY', 'CA', 'TX', 'NY', 'CA']
}
df = pd.DataFrame(data)

# Count duplicates
# Group by the column you want to check for duplicates
# Use transform('size') to get the count of each group
# Assign result to a new column

df['Counts']= df.groupby('UniqueID')['UniqueID'].transform('size')

print(df)

    UserName  UniqueID State  Counts
0    JohnDoe       101    NY       2
1  AnnaSmith       102    CA       2
2  JamesBond       103    TX       1
3    JohnDoe       101    NY       2
4  AnnaSmith       102    CA       2

In [83]:

# Create a sample dataframe
data = {
    'UserName': ['JohnDoe', 'AnnaSmith', 'JamesBond', 'JohnDoe', 'AnnaSmith'],
    'UniqueID': [101, 102, 103, 101, 102],
    'State': ['NY', 'CA', 'TX', 'NY', 'CA'],
    'Sales': [200, 150, 300, 250, 100]
}
df = pd.DataFrame(data)

# Normalize sales within each UniqueID group
df['NormalizedSales'] = df.groupby('UniqueID')['Sales'].transform(lambda x: (x - x.mean()) / x.std())

print(df)

    UserName  UniqueID State  Sales  NormalizedSales
0    JohnDoe       101    NY    200        -0.707107
1  AnnaSmith       102    CA    150         0.707107
2  JamesBond       103    TX    300              NaN
3    JohnDoe       101    NY    250         0.707107
4  AnnaSmith       102    CA    100        -0.707107
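
To see where these numbers come from, the sketch below repeats the calculation by hand for two of the groups. `Series.std()` uses the sample standard deviation (ddof=1) by default, which is undefined for a group with a single row and explains the NaN for UniqueID 103.

```python
import pandas as pd

sales_101 = pd.Series([200, 250])  # the two Sales values for UniqueID 101
sales_103 = pd.Series([300])       # the single Sales value for UniqueID 103

# mean = 225, sample std = sqrt((25**2 + 25**2) / (2 - 1)) ~= 35.355
z_101 = (sales_101 - sales_101.mean()) / sales_101.std()
print(z_101.tolist())

# One observation: the divisor (n - 1) is zero, so the result is NaN
print(sales_103.std())
```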

Multi-column conditional flagging of rows with `iterrows()`¶

In [84]:

# Create a sample dataframe
data = {'Name': ['John', 'Anna', 'James'], 'Age': [28,22,35]}
sdf= pd.DataFrame(data)

for index, row in sdf.iterrows():
    if row['Age'] < 30 and "J" in row['Name']:
        sdf.at[index, 'Category'] = 'Young J'
    else:
        sdf.at[index, 'Category'] = 'Other'

print(sdf)

    Name  Age Category
0   John   28  Young J
1   Anna   22    Other
2  James   35    Other
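
For a condition this simple, a vectorized version is usually possible. The sketch below uses a boolean mask with `numpy.where` instead of `iterrows()`; it is an alternative technique, and `is_young_j` is an illustrative name.

```python
import numpy as np
import pandas as pd

sdf = pd.DataFrame({'Name': ['John', 'Anna', 'James'], 'Age': [28, 22, 35]})

# Combine column-wise conditions with & (not `and`); wrap comparisons in parentheses
is_young_j = (sdf['Age'] < 30) & sdf['Name'].str.contains('J', regex=False)

# np.where picks 'Young J' where the mask is True and 'Other' elsewhere
sdf['Category'] = np.where(is_young_j, 'Young J', 'Other')

print(sdf)
```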

Data transformation with `iterrows()`¶

In [85]:

for index, row in sdf.iterrows():
    sdf.at[index, 'New Age'] = row['Age']+10 # add 10 years to each persons age

print(sdf)

    Name  Age Category  New Age
0   John   28  Young J     38.0
1   Anna   22    Other     32.0
2  James   35    Other     45.0
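
Note that `New Age` prints as 38.0 rather than 38. A likely explanation is that filling a brand-new column one cell at a time starts it out with missing values, which forces a floating-point type. As a sketch of the vectorized alternative, adding 10 to the whole column keeps the integer type:

```python
import pandas as pd

sdf = pd.DataFrame({'Name': ['John', 'Anna', 'James'], 'Age': [28, 22, 35]})

# Arithmetic on a whole column is applied element by element
sdf['New Age'] = sdf['Age'] + 10

print(sdf['New Age'].tolist())  # [38, 32, 45]
print(sdf['New Age'].dtype)
```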

Multi-column Conditional Flagging or Computation with `iterrows()`¶

In [86]:

for index, row in sdf.iterrows():
    if row['Age'] < 30 and "J" in row['Name']:
        sdf.at[index, 'Category'] = 'Young J'
    else:
        sdf.at[index, 'Category'] = 'Other'

print(sdf)

    Name  Age Category  New Age
0   John   28  Young J     38.0
1   Anna   22    Other     32.0
2  James   35    Other     45.0

In [87]:

for index,row in sdf.iterrows():
    if row['Age']>30 and 'a' in row['Name']:
        sdf.at[index, 'Flag'] = True

    else:
        sdf.at[index, 'Flag']= False
print(sdf)

    Name  Age Category  New Age   Flag
0   John   28  Young J     38.0  False
1   Anna   22    Other     32.0  False
2  James   35    Other     45.0   True

Mark specific rows with `iterrows()`¶

In [88]:

for index, row in sdf.iterrows():
    if row['Name'].startswith('J') and row['Age']>25:
        sdf.at[index, 'Status'] = 'Senior J'

print(sdf)

    Name  Age Category  New Age   Flag    Status
0   John   28  Young J     38.0  False  Senior J
1   Anna   22    Other     32.0  False       NaN
2  James   35    Other     45.0   True  Senior J

Combine List Comprehension and iterrows() to extract a specific list from a dataframe¶

In [89]:

# Load a sample dataset to demonstrate the application of list comprehension and iterrows()
# this is a dataset of tornadoes occurring in the state of Minnesota in recent years

df = pd.read_csv(r".\Data\storm_data_search_results.csv")
# Set the option to display all columns
pd.set_option('display.max_columns', None)

df.head()

Out[89]:

	EVENT_ID	CZ_NAME_STR	BEGIN_LOCATION	BEGIN_DATE	BEGIN_TIME	EVENT_TYPE	TOR_F_SCALE	DAMAGE_PROPERTY_NUM	STATE_ABBR	CZ_TIMEZONE	EPISODE_ID	CZ_TYPE	CZ_FIPS	WFO	SOURCE	TOR_LENGTH	TOR_WIDTH	BEGIN_RANGE	BEGIN_AZIMUTH	END_RANGE	END_AZIMUTH	END_LOCATION	END_DATE	END_TIME	BEGIN_LAT	BEGIN_LON	END_LAT	END_LON	EVENT_NARRATIVE	EPISODE_NARRATIVE	ABSOLUTE_ROWNUMBER
0	626306	POPE CO.	VILLARD	05/25/2016	1410	Tornado	EF0	10000	MN	CST-6	104565	C	121	MPX	Law Enforcement	0.16	25	2	SSW	1	SSW	VILLARD	05/25/2016	1412	45.6989	-95.2829	45.7000	-95.2800	A few boats were flipped, a shed was damaged a...	An Isolated but severe thunderstorm developed ...	1
1	626307	STEARNS CO.	ST ANTHONY	05/25/2016	1709	Tornado	EF0	15000	MN	CST-6	104565	C	145	MPX	Trained Spotter	3.30	25	1	NE	3	SSE	ST FRANCIS	05/25/2016	1715	45.6894	-94.6042	45.7298	-94.5674	A trained spotter video taped a tornado near H...	An Isolated but severe thunderstorm developed ...	2
2	629201	RED LAKE CO.	OKLEE	05/27/2016	1314	Tornado	EF0	0	MN	CST-6	104632	C	125	FGF	Law Enforcement	0.05	50	2	WNW	2	WNW	OKLEE	05/27/2016	1315	47.8400	-95.9200	47.8400	-95.9200	Two funnel clouds were noted between Brooks an...	Morning sunshine and moisture from recent rain...	3
3	629205	CLAY CO.	MOORHEAD ARPT	05/27/2016	1357	Tornado	EF0	0	MN	CST-6	104632	C	27	FGF	Storm Chaser	0.05	75	2	SSW	2	SSW	MOORHEAD ARPT	05/27/2016	1358	46.8200	-96.7000	46.8200	-96.7000	Evidence from photographs and video indicate t...	Morning sunshine and moisture from recent rain...	4
4	629206	CLAY CO.	GLYNDON	05/27/2016	1401	Tornado	EF0	0	MN	CST-6	104632	C	27	FGF	Broadcast Media	0.05	50	2	NNW	2	NNW	GLYNDON	05/27/2016	1402	46.9000	-96.6000	46.9000	-96.6000	A brief touchdown was noted in a photo and rep...	Morning sunshine and moisture from recent rain...	5

In [90]:

# List comprehension to extract coordinates of EF0 tornado events
ef0_coordinates = [(row['BEGIN_LAT'], row['BEGIN_LON']) for index, row in df.iterrows() if row['TOR_F_SCALE'] == 'EF0']
print(ef0_coordinates)

[(45.6989, -95.2829), (45.6894, -94.6042), (47.84, -95.92), (46.82, -96.7), (46.9, -96.6), (45.5488, -94.808), (44.1, -96.3105), (43.9812, -96.3452), (43.9589, -94.2033), (44.1973, -93.5417), (44.2106, -93.5253), (44.3134, -93.4661), (44.1282, -93.8622), (45.51, -96.64), (45.56, -96.6), (45.61, -96.53), (48.91, -95.72), (46.5, -94.79), (46.496, -94.7789), (45.2445, -95.9863), (45.2506, -95.8705), (44.3938, -92.9217), (45.3265, -94.4055), (44.9472, -94.026), (43.762, -93.2111), (47.62, -96.58), (47.54, -96.51), (47.8878, -94.7866), (43.8482, -93.2348), (44.1631, -92.1789), (44.1286, -92.2541), (45.4433, -95.8244), (45.79, -95.8), (46.34, -96.54), (46.3, -96.29), (44.0561, -92.3083), (43.9989, -92.1358), (43.998, -92.3098), (44.3147, -94.3519), (45.3784, -93.2268), (45.3072, -93.0197), (45.3229, -92.8595), (45.1811, -92.8581), (46.1529, -94.9219), (45.9763, -95.5803), (45.9815, -95.3762), (44.3831, -95.8188), (44.3921, -95.78), (44.1615, -95.0339), (46.81, -96.58), (47.4506, -94.2105), (47.4341, -94.2526), (47.3999, -94.2862), (44.2279, -94.1762), (44.244, -94.1705), (44.3939, -94.1851), (44.5377, -94.2398), (44.5654, -94.2676), (44.5359, -93.5852), (44.5436, -93.59), (44.694, -94.5097), (44.7246, -93.4707), (44.8517, -94.309), (44.8802, -94.0465), (44.936, -95.7351), (45.4528, -95.009), (43.8444, -93.7624), (43.5129, -92.2036), (44.41, -96.15), (48.02, -94.98), (43.54, -94.67), (45.1407, -94.7912), (45.1396, -94.7571), (44.3406, -93.0498), (44.3424, -93.0405), (44.4935, -92.7473), (43.7077, -94.2478), (44.0432, -94.1665), (44.1155, -94.0785), (44.1186, -93.6027), (44.1816, -93.6305), (44.0797, -93.4774), (44.4013, -93.293), (44.2672, -92.9006), (44.5419, -92.9688), (44.5829, -92.975), (44.537, -92.919), (44.3353, -92.6512), (44.3665, -92.5506), (44.3684, -92.54), (44.6382, -92.6787), (47.18, -96.07), (47.185, -96.065), (43.5837, -93.2778), (43.5482, -92.3902), (43.6029, -92.3531), (43.6835, -92.3285), (43.5005, -92.2577), (44.0028, -96.3268), (44.046, -93.2279), 
(44.2028, -94.8309), (44.1049, -94.6351), (43.76, -93.17), (44.5843, -93.9426), (44.2999, -93.6969), (46.7, -96.74), (46.84, -96.45), (44.5962, -93.6957), (48.48, -95.22), (48.0519, -92.6779), (43.9355, -91.5095), (44.8053, -94.3299), (44.8555, -94.1804), (43.8698, -95.1275), (43.61, -94.41), (46.28, -95.44), (43.9761, -93.3639), (44.5721, -93.1216), (43.9071, -95.7834), (43.7162, -95.7486), (44.1095, -95.7834), (44.1313, -95.7583), (43.86, -95.2617), (45.9851, -93.7478), (46.4339, -93.7791), (44.9632, -93.7922), (45.7518, -93.3108), (44.443, -92.2769), (44.7455, -93.1139), (43.6501, -93.0624), (44.8795, -95.1011), (48.23, -96.72), (44.0015, -91.8063), (44.0318, -91.7238), (48.42, -95.83), (46.24, -95.52), (46.2639, -93.9951), (44.2047, -94.1376), (45.6716, -93.1011), (44.8809, -92.9104), (46.77, -96.25), (44.6018, -94.2095), (46.93, -95.12), (47.42, -96.27), (45.97, -96.19), (45.97, -96.24), (45.95, -96.12), (45.7443, -95.6223), (43.6655, -95.666), (45.8545, -95.2105), (45.8278, -95.0473), (46.2176, -94.6484), (45.2072, -94.986), (46.0678, -94.294), (44.767, -94.3406), (45.1899, -94.8422), (45.7759, -94.4501), (45.2391, -94.7576), (45.2433, -94.7635), (45.9745, -93.8804), (45.0452, -94.5504), (44.8404, -93.7498), (46.54, -93.01), (45.1378, -93.657), (45.0351, -93.368), (44.7089, -96.2051), (44.1849, -93.314), (44.1989, -93.3035), (44.1965, -93.3154), (44.2135, -93.3378), (44.5394, -93.8252), (44.5432, -93.825), (44.4055, -93.3002), (44.6251, -93.2647), (46.29, -95.66), (47.78, -94.86), (47.89, -96.73), (45.4506, -95.003), (45.5726, -94.5994), (44.2441, -94.8627), (44.226, -94.8068), (44.3038, -94.641), (43.8756, -93.5422), (43.8758, -93.5136), (44.4824, -93.952), (44.7507, -93.383), (44.7906, -93.2434), (44.7388, -93.2146), (44.0331, -92.2261), (45.68, -96.46), (45.76, -96.29), (45.3, -96.34), (45.51, -96.5), (45.5, -96.56), (45.8019, -96.293), (43.4997, -93.6704), (43.5323, -93.3052), (43.4997, -93.0559), (43.5142, -93.0493), (43.6113, -93.2505), (43.894, 
-93.0592), (43.7763, -92.37), (43.6699, -92.0805), (43.9605, -91.7908), (43.8998, -91.9822), (43.9088, -91.897), (47.68, -96.47), (44.6497, -92.8613), (43.9483, -95.4623), (44.2215, -94.4585), (44.6386, -93.9182), (44.6748, -93.8888), (45.1844, -93.272), (45.3534, -95.2893), (45.6663, -95.5279), (45.843, -96.2661), (45.7447, -94.9528), (46.3123, -95.5963), (46.1159, -94.6098), (46.9438, -92.0955), (43.5719, -95.9786), (45.1454, -95.9201), (44.8885, -94.0098), (44.9783, -93.9716), (47.229, -93.6597), (48.1, -95.63), (46.86, -96.5), (48.13, -95.66), (48.8626, -96.7817), (43.8605, -91.8798), (43.7967, -91.6015), (43.7162, -93.6339), (44.7152, -93.2642), (44.7636, -93.2156), (44.8248, -93.1569), (44.9018, -93.07), (44.9467, -93.035), (44.9004, -95.261), (44.8988, -95.2485), (44.8185, -95.4746), (43.8942, -95.1545), (47.41, -96.6), (47.4991, -96.7571), (45.3009, -94.9739), (43.818, -95.505), (45.1051, -93.8302)]
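
The same extraction can be written as a list comprehension over zipped columns, which avoids building a Series for every row. This is a sketch only: `storms` is a small hypothetical stand-in for the CSV, and its EF1 row is made up so that the filter has something to exclude.

```python
import pandas as pd

# Hypothetical stand-in for the storm data: only the three columns used here
storms = pd.DataFrame({
    'TOR_F_SCALE': ['EF0', 'EF1', 'EF0'],
    'BEGIN_LAT': [45.6989, 44.0, 47.84],
    'BEGIN_LON': [-95.2829, -93.0, -95.92],
})

# zip walks the three columns in parallel; the if clause keeps only EF0 events
ef0_coordinates = [
    (lat, lon)
    for lat, lon, scale in zip(storms['BEGIN_LAT'], storms['BEGIN_LON'], storms['TOR_F_SCALE'])
    if scale == 'EF0'
]

print(ef0_coordinates)
```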

In [91]:

mn_counties= df["CZ_NAME_STR"].unique()

In [92]:

mn_counties

Out[92]:

array(['POPE CO.', 'STEARNS CO.', 'RED LAKE CO.', 'CLAY CO.',
       'PIPESTONE CO.', 'BLUE EARTH CO.', 'LE SUEUR CO.', 'RICE CO.',
       'BIG STONE CO.', 'TRAVERSE CO.', 'LAKE OF THE WOODS CO.',
       'ROSEAU CO.', 'WADENA CO.', 'CASS CO.', 'AITKIN CO.', 'ITASCA CO.',
       'ST. LOUIS CO.', 'CROW WING CO.', 'SWIFT CO.', 'GOODHUE CO.',
       'WABASHA CO.', 'MEEKER CO.', 'POLK CO.', 'BELTRAMI CO.',
       'MCLEOD CO.', 'FREEBORN CO.', 'NORMAN CO.', 'MORRISON CO.',
       'FARIBAULT CO.', 'SHERBURNE CO.', 'STEELE CO.', 'HUBBARD CO.',
       'KANDIYOHI CO.', 'STEVENS CO.', 'GRANT CO.', 'WILKIN CO.',
       'OLMSTED CO.', 'LAKE CO.', 'NICOLLET CO.', 'ANOKA CO.',
       'CHISAGO CO.', 'WASHINGTON CO.', 'TODD CO.', 'DOUGLAS CO.',
       'LYON CO.', 'BROWN CO.', 'FILLMORE CO.', 'SIBLEY CO.', 'SCOTT CO.',
       'NOBLES CO.', 'CHIPPEWA CO.', 'LINCOLN CO.', 'MARTIN CO.',
       'CLEARWATER CO.', 'WINONA CO.', 'WASECA CO.', 'DAKOTA CO.',
       'MAHNOMEN CO.', 'COOK CO.', 'REDWOOD CO.', 'WATONWAN CO.',
       'OTTER TAIL CO.', 'COTTONWOOD CO.', 'MURRAY CO.', 'MILLE LACS CO.',
       'WRIGHT CO.', 'CARVER CO.', 'HENNEPIN CO.', 'KANABEC CO.',
       'MARSHALL CO.', 'RENVILLE CO.', 'BECKER CO.', 'ISANTI CO.',
       'PENNINGTON CO.', 'CARLTON CO.', 'YELLOW MEDICINE CO.',
       'MOWER CO.', 'DODGE CO.', 'HOUSTON CO.', 'LAC QUI PARLE CO.',
       'ROCK CO.', 'PINE CO.', 'KITTSON CO.', 'RAMSEY CO.'], dtype=object)

In [93]:

len(mn_counties)

Out[93]:

Introduction to Dictionary Comprehensions¶

Definition
- Dictionary comprehensions are a concise and efficient way to create dictionaries in Python
- Like list comprehensions, they provide an elegant way to perform operations and apply conditions to iterables
- Specifically, they allow key-value pairs to be created or transformed in a single line of code

Why Do We Use Dictionary Comprehensions?:

Conciseness: Reduces the complexity and amount of code compared to traditional loops for creating dictionaries, making the code more readable
Performance: Often somewhat faster than a loop that inserts items one at a time, because the comprehension avoids per-iteration overhead in the loop body
Expressiveness: Enhances code clarity by focusing on the dictionary creation logic rather than the mechanics of looping and inserting key-value pairs
Versatility: Capable of incorporating conditional logic and multiple sources, allowing for sophisticated transformations and filtering in dictionary creation
Key point: Use dictionary comprehensions to efficiently transform or map data into key-value pairs

Syntax:

Basic Structure: A dictionary comprehension consists of curly braces {} containing a key: value expression followed by a for clause
Optionally, that first for clause can be followed by additional for or if clauses

Generalized Examples:¶

{key_expr: value_expr for item in iterable}
{key_expr: value_expr for item in iterable if condition}
{key_expr: value_expr for item in iterable if condition1 if condition2}
{key_expr: value_expr for item1 in iterable1 for item2 in iterable2}
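
The forms with two `if` clauses and two `for` clauses do not appear in the cells that follow, so here is a minimal sketch of each. The names `squares_of_sixes` and `products` are illustrative only.

```python
# Two `if` clauses: keep only keys divisible by both 2 and 3.
squares_of_sixes = {n: n**2 for n in range(1, 20) if n % 2 == 0 if n % 3 == 0}
print(squares_of_sixes)  # {6: 36, 12: 144, 18: 324}

# Two `for` clauses: each key is a (row, col) tuple built from both loop variables.
products = {(row, col): row * col for row in range(2) for col in range(3)}
print(products)
# {(0, 0): 0, (0, 1): 0, (0, 2): 0, (1, 0): 0, (1, 1): 1, (1, 2): 2}
```

If two iterations produce the same key, the later value overwrites the earlier one, so keys should be unique when every pair needs to survive.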

In [94]:

# quick recap of how to manipulate a dictionary
from datetime import datetime

from pprint import pprint # pretty print, readable format for dictionaries

# create an empty dictionary for products and their metadata
products_dict = {}

# add to the dictionary
# This is a nested dictionary, where the key is product ID and the value is another dictionary
products_dict['0001-2024']= {
    'name':'apple', 
    'amount':2, 
    'date': datetime.now().date().strftime('%Y%m%d')}

# display the dictionary
pprint(products_dict)

products_dict['0002-2024'] = {'name': 'banana'}

# display each dictionary entry along with labels
for productID, metadata in products_dict.items():
    pprint(f"Product ID: {productID}, Metadata: {metadata}")

# print the number of items
print(f"Number of items: {len(products_dict)}")

# update an existing entry 
# allows you to add new key value pairs or update existing ones
products_dict['0002-2024'].update({'amount':10, 'date': datetime.now().strftime('%Y%m%d')})

# Display the items
pprint(products_dict.items())
pprint(list(products_dict.items()))


products_dict.update({'0003-2024':{'name': 'mangos','amount':100, 'date': datetime.now().strftime('%Y%m%d')}})

# Display the number of items
print(f"Number of items: {len(products_dict)}")

# display the dictionary
pprint(products_dict)

print()

counter=0
# List out the final nested dictionary of products
print("Final product list:")
for productID, metadata in products_dict.items():
    
    pprint(f"PRoduct ID {counter+1} : {productID}, Metadata: {metadata}")
    counter+=1

{'0001-2024': {'amount': 2, 'date': '20240818', 'name': 'apple'}}
("Product ID: 0001-2024, Metadata: {'name': 'apple', 'amount': 2, 'date': "
 "'20240818'}")
"Product ID: 0002-2024, Metadata: {'name': 'banana'}"
Number of items: 2
dict_items([('0001-2024', {'name': 'apple', 'amount': 2, 'date': '20240818'}), ('0002-2024', {'name': 'banana', 'amount': 10, 'date': '20240818'})])
[('0001-2024', {'amount': 2, 'date': '20240818', 'name': 'apple'}),
 ('0002-2024', {'amount': 10, 'date': '20240818', 'name': 'banana'})]
Number of items: 3
{'0001-2024': {'amount': 2, 'date': '20240818', 'name': 'apple'},
 '0002-2024': {'amount': 10, 'date': '20240818', 'name': 'banana'},
 '0003-2024': {'amount': 100, 'date': '20240818', 'name': 'mangos'}}

Final product list:
("PRoduct ID 1 : 0001-2024, Metadata: {'name': 'apple', 'amount': 2, 'date': "
 "'20240818'}")
("PRoduct ID 2 : 0002-2024, Metadata: {'name': 'banana', 'amount': 10, 'date': "
 "'20240818'}")
("PRoduct ID 3 : 0003-2024, Metadata: {'name': 'mangos', 'amount': 100, "
 "'date': '20240818'}")

Basic Dictionary Comprehension¶

In [95]:

# Make a dictionary where the keys are numbers and values are their squares

# Basic components of dictionary comprehension
# (1) key-value expression
# (2) item
# (3) iterable

            #(1)     (2)     (3)
squares = {x:x**2 for x in range(1,10)}
print(squares)

{1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

Conditional Dictionary Comprehension¶

In [96]:

# Make a dictionary of even numbers and their squares
even_squares = {x:x**2 for x in range(1,10) if x%2==0}

print(even_squares)

{2: 4, 4: 16, 6: 36, 8: 64}

Using Functions in Dictionary Comprehension¶

In [97]:

# Create a dictionary that maps each word in a list to its length

# suppose you start with a list of words
word_list = ["apple", "banana", "cherry"]


word_length_dict=  {word:len(word) for word in word_list}

print(word_length_dict)

{'apple': 5, 'banana': 6, 'cherry': 6}

Using a dataframe in a Dictionary Comprehension¶

In [28]:

import pandas as pd
# Create a dictionary that maps each word in a column of a dataframe to its length

# You start with a dataframe
data = {'Words':(["apple", "banana", "cherry"])*2} # Duplicate the list to increase the number of items
print(data)


df = pd.DataFrame(data)
print(df)

# use a dictionary comprehension directly on the dataframe column to map each word in the column to its length
word_length_dict = {word: len(word) for word in df['Words']}

print(word_length_dict)

# add a count to the same dataframe as a new column
df["Word Counts"]=df.groupby("Words")["Words"].transform("count")

print(f"\n{df}\n")

# Map the lengths from the dictionary onto the matching words in the Words column
df["Word Lenghths"] = df["Words"].map(word_length_dict)
print(f"\n{df}\n")

{'Words': ['apple', 'banana', 'cherry', 'apple', 'banana', 'cherry']}
    Words
0   apple
1  banana
2  cherry
3   apple
4  banana
5  cherry
{'apple': 5, 'banana': 6, 'cherry': 6}

    Words  Word Counts
0   apple            2
1  banana            2
2  cherry            2
3   apple            2
4  banana            2
5  cherry            2


    Words  Word Counts  Word Lenghths
0   apple            2              5
1  banana            2              6
2  cherry            2              6
3   apple            2              5
4  banana            2              6
5  cherry            2              6

In [99]:

# Read HTML tables using the lxml parser
counties_list = pd.read_html(
    "https://en.wikipedia.org/wiki/List_of_counties_in_Minnesota"
)

In [100]:

counties_list=counties_list[0]

In [101]:

counties_list

Out[101]:

	County	FIPS code[3]	County seat[4]	Est.[1][4]	Origin[5][6][7]	Etymology	Population[8]	Area[4][8]	Map
0	Aitkin County	1	Aitkin	1857	Pine County, Ramsey County	William Alexander Aitken (1785–1851), early fu...	16102	1,819.30 sq mi (4,712 km2)	NaN
1	Anoka County	3	Anoka	1857	Ramsey County	Dakota word meaning "both sides"	372441	423.61 sq mi (1,097 km2)	NaN
2	Becker County	5	Detroit Lakes	1858	Cass County, Pembina County	George Loomis Becker, former state senator and...	35283	1,310.42 sq mi (3,394 km2)	NaN
3	Beltrami County	7	Bemidji	1866	Unorganized Territory, Itasca County, Pembina ...	Giacomo Beltrami, Italian explorer who explore...	46718	2,505.27 sq mi (6,489 km2)	NaN
4	Benton County	9	Foley	1849	One of nine original counties; formed from res...	Thomas Hart Benton (1782–1858), former United ...	41600	408.28 sq mi (1,057 km2)	NaN
...	...	...	...	...	...	...	...	...	...
82	Watonwan County	165	St. James	1860	Brown County	Watonwan River, a river that flows through Min...	11077	434.51 sq mi (1,125 km2)	NaN
83	Wilkin County	167	Breckenridge	1858	Cass County, Pembina County	Alexander Wilkin (1820–1864), Minnesota politi...	6306	751.43 sq mi (1,946 km2)	NaN
84	Winona County	169	Winona	1854	Fillmore County, Wabasha County	Named after Wee-No-Nah, Sister, or Cousin of C...	49721	626.30 sq mi (1,622 km2)	NaN
85	Wright County	171	Buffalo	1855	Cass County, Sibley County	Silas Wright (1795–1847), former United States...	151150	660.75 sq mi (1,711 km2)	NaN
86	Yellow Medicine County	173	Granite Falls	1871	Redwood County	Yellow Medicine River, a river that flows thro...	9467	757.96 sq mi (1,963 km2)	NaN

87 rows × 9 columns

Introduction to Geocoding with Nominatim via Geopy¶

Geocoding is the process of converting addresses (like "1600 Amphitheatre Parkway, Mountain View, CA") into geographic coordinates (like latitude 37.423021 and longitude -122.083739), which you can use to place markers on a map or to position the map.

Capabilities of Nominatim (Geopy):

Address Geocoding: Converts street addresses or other descriptive locations into geographic coordinates.
Reverse Geocoding: Converts geographic coordinates into a human-readable address.
Extensive Coverage: Utilizes OpenStreetMap data, providing global coverage often with fine-grained control over geocoding queries.
Customization Options: Allows customization of requests, including specifying the language of the result, the bounding box for constraining searches, and more.

Syntax for Geocoding and Reverse Geocoding

1) Geocoding (Address to Coordinates)

- Initialization: Create a Nominatim object with a user-defined user_agent
- Query: Use the .geocode() method with the address as a string.

2) Reverse Geocoding (Coordinates to Address)

- Initialization: Create a Nominatim object with a user-defined user_agent
- Query: Use the .reverse() method with a string in the format "latitude, longitude" or with a (latitude, longitude) tuple.
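
The "latitude, longitude" query string for reverse geocoding can be built from numeric coordinates. The following is a minimal sketch, assuming a hypothetical helper named format_reverse_query (it is not part of geopy) that checks the coordinate ranges before formatting:

```python
def format_reverse_query(latitude, longitude):
    """Build a "latitude, longitude" string for a reverse-geocoding query.

    Hypothetical helper for illustration only; it is not part of geopy.
    """
    # Latitude must lie in [-90, 90] and longitude in [-180, 180]
    if not -90 <= latitude <= 90:
        raise ValueError(f"latitude out of range: {latitude}")
    if not -180 <= longitude <= 180:
        raise ValueError(f"longitude out of range: {longitude}")
    return f"{latitude}, {longitude}"


query = format_reverse_query(44.730098, -93.343248)
print(query)  # 44.730098, -93.343248
```

The resulting string could then be passed to the .reverse() method of a Nominatim object.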

In [45]:

from geopy.geocoders import Nominatim
import requests

geolocator = Nominatim(user_agent="geocode_Address")

def getAddress_coords(address):
    location = geolocator.geocode(address)
    if location:
        latitude, longitude = location.latitude, location.longitude
        print(location)
        
        # Get elevation in meters
        elevation_url = f"https://api.open-elevation.com/api/v1/lookup?locations={latitude},{longitude}"
        response = requests.get(elevation_url, timeout=10)  # stop waiting if the server does not respond within 10 seconds
        elevation_data = response.json()
        print(elevation_data)
        elevation_meters = elevation_data['results'][0]['elevation'] if 'results' in elevation_data else None
        print(elevation_meters)
        #convert elevation to feet
        elevation_feet = elevation_meters * 3.28084 if elevation_meters is not None else None
        print(elevation_feet)
        return (latitude, longitude, elevation_feet)
    else:
        print("Address Not Found, coordinates will be blank")
        return (None,None, None)

In [46]:

geocode_result = getAddress_coords("5057 Edgewater Court, Savage, MN")
print(geocode_result)

5057, Edgewater Court, Savage, Scott County, Minnesota, 55378, United States
{'results': [{'latitude': 44.730098, 'longitude': -93.343248, 'elevation': 271.0}]}
271.0
889.10764
(44.73009822071884, -93.3432476572783, 889.10764)
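
The function above indexes ['results'][0] directly, which would raise an IndexError if the service returned an empty results list. The sketch below shows one way to make that step more defensive. It assumes the payload has the shape printed above, and elevation_feet_from_payload is a hypothetical helper name:

```python
METERS_TO_FEET = 3.28084  # same conversion factor used in the cell above


def elevation_feet_from_payload(payload):
    """Return the first elevation in feet, or None if the payload has no result.

    Hypothetical helper; assumes the payload looks like
    {'results': [{'elevation': <meters>, ...}]}.
    """
    # "or []" covers both a missing key and an explicit None
    results = payload.get("results") or []
    if not results:
        return None
    meters = results[0].get("elevation")
    return None if meters is None else meters * METERS_TO_FEET


sample = {"results": [{"latitude": 44.730098, "longitude": -93.343248, "elevation": 271.0}]}
print(elevation_feet_from_payload(sample))           # elevation in feet
print(elevation_feet_from_payload({"results": []}))  # None
print(elevation_feet_from_payload({}))               # None
```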

In [41]:

from geopy.geocoders import Nominatim

#initialize geocoder, Nominatim object

geolocator= Nominatim(user_agent= "geocode_Address")

def getAddress_coords(address):
    location= geolocator.geocode(address)
    if location:
        print(location)
        return (location.latitude, location.longitude)
    else:
        print("Address Not Found, coordinates will be blank")
        return(None,None)

geocode_result = getAddress_coords("5057 Edgewater Court, Savage, MN")

print(geocode_result)

5057, Edgewater Court, Savage, Scott County, Minnesota, 55378, United States
(44.73009822071884, -93.3432476572783)

In [42]:

geocode_result = getAddress_coords("Murphy-Hanrehan Park Reserve, Savage, MN")

print(geocode_result)

Murphy-Hanrehan Park Reserve, 15501, Savage, Scott County, Minnesota, 55378, United States
(44.71070400000001, -93.3343809887289)

In [104]:

geolocator= Nominatim(user_agent= "geocode_Address")

def getAddress(coords):
    location = geolocator.reverse(coords)
    if location:
        return location.address
    else:
        print("Coordinates Not Found, address will be blank")
        return None

geocode_Address_result = getAddress(geocode_result)

print(geocode_Address_result)

Sunset Lake Road, Credit River, Scott County, Minnesota, 55306, United States

In [105]:

# Combine retrieval of external data from geocoding service with a dictionary comprehension of a dataframe column
# Create a dictionary that maps counties to their coordinates

from geopy.geocoders import Nominatim

# initialize geocoder

geolocator= Nominatim(user_agent= "geoapiExercise")

def get_lat_lon(county):
    # Append ", Minnesota" to ensure the geocoding query is localized
    location= geolocator.geocode(county+ ", Minnesota")
    if location:
        return (location.latitude, location.longitude)
    else:
        return (None, None)
        
county_names = counties_list["County"]


# print the county names
print(county_names)
print(county_names.dtype)

# Dictionary comprehension that maps each county name in the DataFrame column to its (lat, lon) coordinates
# Each county name becomes a key, and get_lat_lon() supplies the value for that key
coordinates_list = {county: get_lat_lon(county) for county in counties_list["County"]}

0              Aitkin County
1               Anoka County
2              Becker County
3            Beltrami County
4              Benton County
               ...          
82           Watonwan County
83             Wilkin County
84             Winona County
85             Wright County
86    Yellow Medicine County
Name: County, Length: 87, dtype: object
object
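
The dictionary comprehension above sends one request per county, 87 in total. The public Nominatim service asks clients to limit their request rate (its usage policy mentions at most one request per second), so it can help to space the calls out. The sketch below is one way to do that with the standard library only. The names throttled and fake_lookup are hypothetical, and fake_lookup stands in for get_lat_lon so the example runs without network access:

```python
import time


def throttled(func, min_delay_seconds=1.0):
    """Wrap func so that consecutive calls are at least min_delay_seconds apart."""
    last_finished = None

    def wrapper(*args, **kwargs):
        nonlocal last_finished
        if last_finished is not None:
            remaining = min_delay_seconds - (time.monotonic() - last_finished)
            if remaining > 0:
                time.sleep(remaining)
        try:
            return func(*args, **kwargs)
        finally:
            # Record the finish time even if func raised
            last_finished = time.monotonic()

    return wrapper


def fake_lookup(county):
    """Stand-in for get_lat_lon; always returns the same placeholder pair."""
    return (0.0, 0.0)


# A short delay keeps the demo quick; a real run would use about 1 second
slow_lookup = throttled(fake_lookup, min_delay_seconds=0.1)

start = time.monotonic()
demo = {county: slow_lookup(county) for county in ["Aitkin County", "Anoka County", "Becker County"]}
elapsed = time.monotonic() - start

print(demo)
print(f"{elapsed:.2f} seconds for {len(demo)} lookups")
```

geopy also provides a RateLimiter class in geopy.extra.rate_limiter that serves the same purpose.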

In [106]:

print(f"The dataset is an {type(county_names)}")
print(f"The data value in the dataset is an {county_names.dtype}")
print(county_names.index.to_list())

The dataset is an <class 'pandas.core.series.Series'>
The data value in the dataset is an object
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86]

In [107]:

coordinates_list

Out[107]:

{'Aitkin County': (46.5714822, -93.3847595),
 'Anoka County': (45.2710195, -93.2827625),
 'Becker County': (46.9298236, -95.6761851),
 'Beltrami County': (47.9978537, -94.8799011),
 'Benton County': (45.7162129, -94.0481042),
 'Big Stone County': (45.385266, -96.3557364),
 'Blue Earth County': (44.0109722, -94.0560643),
 'Brown County': (44.2350232, -94.6955051),
 'Carlton County': (46.5799933, -92.7206334),
 'Carver County': (44.807118, -93.7871792),
 'Cass County': (47.0234117, -94.3454604),
 'Chippewa County': (45.027661, -95.5314914),
 'Chisago County': (45.4758877, -92.8849411),
 'Clay County': (46.8994904, -96.5088202),
 'Clearwater County': (47.5643825, -95.3747844),
 'Cook County': (47.9149076, -90.47301),
 'Cottonwood County': (44.019068, -95.1658845),
 'Crow Wing County': (46.4665237, -94.1017044),
 'Dakota County': (44.666655, -93.044911),
 'Dodge County': (44.0175404, -92.8678406),
 'Douglas County': (45.9340479, -95.4627651),
 'Faribault County': (43.6647961, -93.9510501),
 'Fillmore County': (43.6466588, -92.0636359),
 'Freeborn County': (43.6763617, -93.3501681),
 'Goodhue County': (44.396973, -92.7175627),
 'Grant County': (45.9358795, -96.0272071),
 'Hennepin County': (45.0257232, -93.4865052),
 'Houston County': (43.6624222, -91.4685617),
 'Hubbard County': (47.1138266, -94.9427679),
 'Isanti County': (45.56932235, -93.32652095523574),
 'Itasca County': (47.4968343, -93.6225663),
 'Jackson County': (43.670011, -95.1500626),
 'Kanabec County': (45.8986948, -93.2850016),
 'Kandiyohi County': (45.142373, -95.0025846),
 'Kittson County': (48.7709208, -96.8074141),
 'Koochiching County': (48.221596, -93.7684251),
 'Lac qui Parle County': (44.986426, -96.2024907),
 'Lake County': (47.6348022, -91.4394994),
 'Lake of the Woods County': (48.7032282, -94.8480091),
 'Le Sueur County': (44.3771652, -93.711443),
 'Lincoln County': (44.4020631, -96.2627763),
 'Lyon County': (44.3880733, -95.8287296),
 'McLeod County': (44.8169135, -94.2495251),
 'Mahnomen County': (47.3313602, -95.8142911),
 'Marshall County': (48.3605336, -96.381968),
 'Martin County': (43.6564337, -94.5498419),
 'Meeker County': (45.1183643, -94.5175345),
 'Mille Lacs County': (45.9311972, -93.640356),
 'Morrison County': (45.9926837, -94.2554658),
 'Mower County': (43.6832277, -92.753704),
 'Murray County': (44.017855, -95.7615205),
 'Nicollet County': (44.3380412, -94.2362169),
 'Nobles County': (43.6634212, -95.7527672),
 'Norman County': (47.3194344, -96.4625779),
 'Olmsted County': (43.9997437, -92.3767816),
 'Otter Tail County': (46.4184196, -95.713142),
 'Pennington County': (48.0513335, -96.0829271),
 'Pine County': (46.0820957, -92.7542126),
 'Pipestone County': (44.0270012, -96.2566582),
 'Polk County': (47.6554613, -96.4193484),
 'Pope County': (45.5850258, -95.4469471),
 'Ramsey County': (45.0165728, -93.0949501),
 'Red Lake County': (47.8605178, -96.0988343),
 'Redwood County': (44.3788613, -95.2532373),
 'Renville County': (44.7242874, -94.9084771),
 'Rice County': (44.3413376, -93.2865484),
 'Rock County': (43.6733632, -96.2574328),
 'Roseau County': (48.7710371, -95.7697882),
 'Saint Louis County': (47.6201005, -92.4363343),
 'Scott County': (44.6506998, -93.5025726),
 'Sherburne County': (45.4427088, -93.7459202),
 'Sibley County': (44.5603522, -94.2085682),
 'Stearns County': (45.535326, -94.6139422),
 'Steele County': (44.0137336, -93.2203671),
 'Stevens County': (45.5837016, -95.9946194),
 'Swift County': (45.2797223, -95.6898654),
 'Todd County': (46.0588428, -94.887283),
 'Traverse County': (45.7836323, -96.4215265),
 'Wabasha County': (44.2767596, -92.2018164),
 'Wadena County': (46.5850936, -94.9606684),
 'Waseca County': (44.0172242, -93.5885717),
 'Washington County': (45.0078657, -92.874565),
 'Watonwan County': (43.9736055, -94.6370354),
 'Wilkin County': (46.3258354, -96.4586194),
 'Winona County': (43.9582272, -91.7807784),
 'Wright County': (45.1489061, -93.9639196),
 'Yellow Medicine County': (44.7198536, -95.8533555)}
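
Because get_lat_lon returns (None, None) when a lookup fails, the resulting dictionary can contain placeholder values. A conditional dictionary comprehension can separate the successful lookups from the failed ones. This is a small sketch on sample data; "Example County" is a made-up entry used only to illustrate a failed lookup:

```python
# Sample results in the same shape as coordinates_list
sample_coordinates = {
    "Aitkin County": (46.5714822, -93.3847595),
    "Anoka County": (45.2710195, -93.2827625),
    "Example County": (None, None),  # made-up failed lookup
}

# Keep only the counties whose lookup returned real coordinates
found = {county: coords for county, coords in sample_coordinates.items() if None not in coords}

# Collect the names that need a retry or a manual check
missing = [county for county, coords in sample_coordinates.items() if None in coords]

print(found)
print(missing)  # ['Example County']
```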

In [108]:

type(coordinates_list)

Out[108]:

dict

In [109]:

coordinates_list.items() # returns a view of the dictionary's (key, value) tuples

Out[109]:

dict_items([('Aitkin County', (46.5714822, -93.3847595)), ('Anoka County', (45.2710195, -93.2827625)), ('Becker County', (46.9298236, -95.6761851)), ('Beltrami County', (47.9978537, -94.8799011)), ('Benton County', (45.7162129, -94.0481042)), ('Big Stone County', (45.385266, -96.3557364)), ('Blue Earth County', (44.0109722, -94.0560643)), ('Brown County', (44.2350232, -94.6955051)), ('Carlton County', (46.5799933, -92.7206334)), ('Carver County', (44.807118, -93.7871792)), ('Cass County', (47.0234117, -94.3454604)), ('Chippewa County', (45.027661, -95.5314914)), ('Chisago County', (45.4758877, -92.8849411)), ('Clay County', (46.8994904, -96.5088202)), ('Clearwater County', (47.5643825, -95.3747844)), ('Cook County', (47.9149076, -90.47301)), ('Cottonwood County', (44.019068, -95.1658845)), ('Crow Wing County', (46.4665237, -94.1017044)), ('Dakota County', (44.666655, -93.044911)), ('Dodge County', (44.0175404, -92.8678406)), ('Douglas County', (45.9340479, -95.4627651)), ('Faribault County', (43.6647961, -93.9510501)), ('Fillmore County', (43.6466588, -92.0636359)), ('Freeborn County', (43.6763617, -93.3501681)), ('Goodhue County', (44.396973, -92.7175627)), ('Grant County', (45.9358795, -96.0272071)), ('Hennepin County', (45.0257232, -93.4865052)), ('Houston County', (43.6624222, -91.4685617)), ('Hubbard County', (47.1138266, -94.9427679)), ('Isanti County', (45.56932235, -93.32652095523574)), ('Itasca County', (47.4968343, -93.6225663)), ('Jackson County', (43.670011, -95.1500626)), ('Kanabec County', (45.8986948, -93.2850016)), ('Kandiyohi County', (45.142373, -95.0025846)), ('Kittson County', (48.7709208, -96.8074141)), ('Koochiching County', (48.221596, -93.7684251)), ('Lac qui Parle County', (44.986426, -96.2024907)), ('Lake County', (47.6348022, -91.4394994)), ('Lake of the Woods County', (48.7032282, -94.8480091)), ('Le Sueur County', (44.3771652, -93.711443)), ('Lincoln County', (44.4020631, -96.2627763)), ('Lyon County', (44.3880733, -95.8287296)), 
('McLeod County', (44.8169135, -94.2495251)), ('Mahnomen County', (47.3313602, -95.8142911)), ('Marshall County', (48.3605336, -96.381968)), ('Martin County', (43.6564337, -94.5498419)), ('Meeker County', (45.1183643, -94.5175345)), ('Mille Lacs County', (45.9311972, -93.640356)), ('Morrison County', (45.9926837, -94.2554658)), ('Mower County', (43.6832277, -92.753704)), ('Murray County', (44.017855, -95.7615205)), ('Nicollet County', (44.3380412, -94.2362169)), ('Nobles County', (43.6634212, -95.7527672)), ('Norman County', (47.3194344, -96.4625779)), ('Olmsted County', (43.9997437, -92.3767816)), ('Otter Tail County', (46.4184196, -95.713142)), ('Pennington County', (48.0513335, -96.0829271)), ('Pine County', (46.0820957, -92.7542126)), ('Pipestone County', (44.0270012, -96.2566582)), ('Polk County', (47.6554613, -96.4193484)), ('Pope County', (45.5850258, -95.4469471)), ('Ramsey County', (45.0165728, -93.0949501)), ('Red Lake County', (47.8605178, -96.0988343)), ('Redwood County', (44.3788613, -95.2532373)), ('Renville County', (44.7242874, -94.9084771)), ('Rice County', (44.3413376, -93.2865484)), ('Rock County', (43.6733632, -96.2574328)), ('Roseau County', (48.7710371, -95.7697882)), ('Saint Louis County', (47.6201005, -92.4363343)), ('Scott County', (44.6506998, -93.5025726)), ('Sherburne County', (45.4427088, -93.7459202)), ('Sibley County', (44.5603522, -94.2085682)), ('Stearns County', (45.535326, -94.6139422)), ('Steele County', (44.0137336, -93.2203671)), ('Stevens County', (45.5837016, -95.9946194)), ('Swift County', (45.2797223, -95.6898654)), ('Todd County', (46.0588428, -94.887283)), ('Traverse County', (45.7836323, -96.4215265)), ('Wabasha County', (44.2767596, -92.2018164)), ('Wadena County', (46.5850936, -94.9606684)), ('Waseca County', (44.0172242, -93.5885717)), ('Washington County', (45.0078657, -92.874565)), ('Watonwan County', (43.9736055, -94.6370354)), ('Wilkin County', (46.3258354, -96.4586194)), ('Winona County', (43.9582272, 
-91.7807784)), ('Wright County', (45.1489061, -93.9639196)), ('Yellow Medicine County', (44.7198536, -95.8533555))])

In [124]:

# The items() method of a dictionary returns an iterable view of tuples
# Each tuple consists of one key-value pair from the dictionary
type(coordinates_list.items()) # this dict_items object is an iterable

# Because this is an iterable, we can use it in a loop to access its elements
# ...OR convert it to other iterables, like lists, that are often required for further data processing

Out[124]:

dict_items
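
A dict_items object is a live view of the dictionary rather than a copy, which is one reason to convert it to a list when a fixed snapshot is needed. The following sketch uses a small sample dictionary to show the difference:

```python
sample = {"Scott County": (44.6506998, -93.5025726)}

view = sample.items()            # live view of the dictionary
snapshot = list(sample.items())  # independent copy taken right now

# Add an entry after the view and the snapshot were created
sample["Rice County"] = (44.3413376, -93.2865484)

print(len(view))      # 2: the view sees the new entry
print(len(snapshot))  # 1: the list does not
```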

In [130]:

for county, data in coordinates_list.items():
    print(county, data)  # prints each county and its associated data

Aitkin County (46.5714822, -93.3847595)
Anoka County (45.2710195, -93.2827625)
Becker County (46.9298236, -95.6761851)
Beltrami County (47.9978537, -94.8799011)
Benton County (45.7162129, -94.0481042)
Big Stone County (45.385266, -96.3557364)
Blue Earth County (44.0109722, -94.0560643)
Brown County (44.2350232, -94.6955051)
Carlton County (46.5799933, -92.7206334)
Carver County (44.807118, -93.7871792)
Cass County (47.0234117, -94.3454604)
Chippewa County (45.027661, -95.5314914)
Chisago County (45.4758877, -92.8849411)
Clay County (46.8994904, -96.5088202)
Clearwater County (47.5643825, -95.3747844)
Cook County (47.9149076, -90.47301)
Cottonwood County (44.019068, -95.1658845)
Crow Wing County (46.4665237, -94.1017044)
Dakota County (44.666655, -93.044911)
Dodge County (44.0175404, -92.8678406)
Douglas County (45.9340479, -95.4627651)
Faribault County (43.6647961, -93.9510501)
Fillmore County (43.6466588, -92.0636359)
Freeborn County (43.6763617, -93.3501681)
Goodhue County (44.396973, -92.7175627)
Grant County (45.9358795, -96.0272071)
Hennepin County (45.0257232, -93.4865052)
Houston County (43.6624222, -91.4685617)
Hubbard County (47.1138266, -94.9427679)
Isanti County (45.56932235, -93.32652095523574)
Itasca County (47.4968343, -93.6225663)
Jackson County (43.670011, -95.1500626)
Kanabec County (45.8986948, -93.2850016)
Kandiyohi County (45.142373, -95.0025846)
Kittson County (48.7709208, -96.8074141)
Koochiching County (48.221596, -93.7684251)
Lac qui Parle County (44.986426, -96.2024907)
Lake County (47.6348022, -91.4394994)
Lake of the Woods County (48.7032282, -94.8480091)
Le Sueur County (44.3771652, -93.711443)
Lincoln County (44.4020631, -96.2627763)
Lyon County (44.3880733, -95.8287296)
McLeod County (44.8169135, -94.2495251)
Mahnomen County (47.3313602, -95.8142911)
Marshall County (48.3605336, -96.381968)
Martin County (43.6564337, -94.5498419)
Meeker County (45.1183643, -94.5175345)
Mille Lacs County (45.9311972, -93.640356)
Morrison County (45.9926837, -94.2554658)
Mower County (43.6832277, -92.753704)
Murray County (44.017855, -95.7615205)
Nicollet County (44.3380412, -94.2362169)
Nobles County (43.6634212, -95.7527672)
Norman County (47.3194344, -96.4625779)
Olmsted County (43.9997437, -92.3767816)
Otter Tail County (46.4184196, -95.713142)
Pennington County (48.0513335, -96.0829271)
Pine County (46.0820957, -92.7542126)
Pipestone County (44.0270012, -96.2566582)
Polk County (47.6554613, -96.4193484)
Pope County (45.5850258, -95.4469471)
Ramsey County (45.0165728, -93.0949501)
Red Lake County (47.8605178, -96.0988343)
Redwood County (44.3788613, -95.2532373)
Renville County (44.7242874, -94.9084771)
Rice County (44.3413376, -93.2865484)
Rock County (43.6733632, -96.2574328)
Roseau County (48.7710371, -95.7697882)
Saint Louis County (47.6201005, -92.4363343)
Scott County (44.6506998, -93.5025726)
Sherburne County (45.4427088, -93.7459202)
Sibley County (44.5603522, -94.2085682)
Stearns County (45.535326, -94.6139422)
Steele County (44.0137336, -93.2203671)
Stevens County (45.5837016, -95.9946194)
Swift County (45.2797223, -95.6898654)
Todd County (46.0588428, -94.887283)
Traverse County (45.7836323, -96.4215265)
Wabasha County (44.2767596, -92.2018164)
Wadena County (46.5850936, -94.9606684)
Waseca County (44.0172242, -93.5885717)
Washington County (45.0078657, -92.874565)
Watonwan County (43.9736055, -94.6370354)
Wilkin County (46.3258354, -96.4586194)
Winona County (43.9582272, -91.7807784)
Wright County (45.1489061, -93.9639196)
Yellow Medicine County (44.7198536, -95.8533555)

In [129]:

# Recall that the dictionary items are an iterable view of (key, value) tuples, so each pair can be unpacked and tested in a loop
for county, data in coordinates_list.items():
    if county == 'Scott County':
        print(county, data)  #  prints county and its associated data

Scott County (44.6506998, -93.5025726)
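
Looping over .items() is useful for seeing how the pairs unpack, but when the exact key is known a dictionary can be queried directly. The sketch below uses a two-entry sample dictionary; "Example County" is a made-up key used to show the missing-key case:

```python
sample_coordinates = {
    "Rice County": (44.3413376, -93.2865484),
    "Scott County": (44.6506998, -93.5025726),
}

# Direct lookup by key; raises KeyError if the county is absent
print(sample_coordinates["Scott County"])

# .get() returns a default instead of raising when the key is missing
print(sample_coordinates.get("Example County", (None, None)))

# A comprehension fits better when filtering on a condition rather than one exact key
southern = {county: coords for county, coords in sample_coordinates.items() if coords[0] < 44.5}
print(southern)
```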

In [120]:

# recall that dataframes can be made from list of tuples

list_dict= [('apples', (100, 2)), ('pears', (20,3))]

dict_df= pd.DataFrame(list_dict, columns=['Fruit', 'Data'])

dict_df


# It is important to recognize that a dictionary can be converted to a list of tuples,
# because pandas DataFrames can be constructed from lists of tuples:
# each tuple is a row and each element of the tuple is a column

Out[120]:

	Fruit	Data
0	apples	(100, 2)
1	pears	(20, 3)
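
In the DataFrame above, the Data column holds each inner tuple as a single value. If every element should get its own column, the nested tuples can be flattened first with a list comprehension that unpacks them in the for clause. A minimal sketch, where the names first and second are placeholders because the example does not say what the two numbers represent:

```python
list_dict = [("apples", (100, 2)), ("pears", (20, 3))]

# Unpack the nested tuple in the for clause so each row becomes a flat 3-tuple
flat_rows = [(fruit, first, second) for fruit, (first, second) in list_dict]

print(flat_rows)  # [('apples', 100, 2), ('pears', 20, 3)]
```

The flat list could then be passed to pd.DataFrame with three column names.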

In [110]:

# Knowing that the iterable of tuples from .items() can be converted into a list of tuples
# allows for straightforward creation of a DataFrame.
list(coordinates_list.items())

# because pandas DataFrames can be created from lists of tuples:
# each tuple is a row and each element of the tuple is a column

Out[110]:

[('Aitkin County', (46.5714822, -93.3847595)),
 ('Anoka County', (45.2710195, -93.2827625)),
 ('Becker County', (46.9298236, -95.6761851)),
 ('Beltrami County', (47.9978537, -94.8799011)),
 ('Benton County', (45.7162129, -94.0481042)),
 ('Big Stone County', (45.385266, -96.3557364)),
 ('Blue Earth County', (44.0109722, -94.0560643)),
 ('Brown County', (44.2350232, -94.6955051)),
 ('Carlton County', (46.5799933, -92.7206334)),
 ('Carver County', (44.807118, -93.7871792)),
 ('Cass County', (47.0234117, -94.3454604)),
 ('Chippewa County', (45.027661, -95.5314914)),
 ('Chisago County', (45.4758877, -92.8849411)),
 ('Clay County', (46.8994904, -96.5088202)),
 ('Clearwater County', (47.5643825, -95.3747844)),
 ('Cook County', (47.9149076, -90.47301)),
 ('Cottonwood County', (44.019068, -95.1658845)),
 ('Crow Wing County', (46.4665237, -94.1017044)),
 ('Dakota County', (44.666655, -93.044911)),
 ('Dodge County', (44.0175404, -92.8678406)),
 ('Douglas County', (45.9340479, -95.4627651)),
 ('Faribault County', (43.6647961, -93.9510501)),
 ('Fillmore County', (43.6466588, -92.0636359)),
 ('Freeborn County', (43.6763617, -93.3501681)),
 ('Goodhue County', (44.396973, -92.7175627)),
 ('Grant County', (45.9358795, -96.0272071)),
 ('Hennepin County', (45.0257232, -93.4865052)),
 ('Houston County', (43.6624222, -91.4685617)),
 ('Hubbard County', (47.1138266, -94.9427679)),
 ('Isanti County', (45.56932235, -93.32652095523574)),
 ('Itasca County', (47.4968343, -93.6225663)),
 ('Jackson County', (43.670011, -95.1500626)),
 ('Kanabec County', (45.8986948, -93.2850016)),
 ('Kandiyohi County', (45.142373, -95.0025846)),
 ('Kittson County', (48.7709208, -96.8074141)),
 ('Koochiching County', (48.221596, -93.7684251)),
 ('Lac qui Parle County', (44.986426, -96.2024907)),
 ('Lake County', (47.6348022, -91.4394994)),
 ('Lake of the Woods County', (48.7032282, -94.8480091)),
 ('Le Sueur County', (44.3771652, -93.711443)),
 ('Lincoln County', (44.4020631, -96.2627763)),
 ('Lyon County', (44.3880733, -95.8287296)),
 ('McLeod County', (44.8169135, -94.2495251)),
 ('Mahnomen County', (47.3313602, -95.8142911)),
 ('Marshall County', (48.3605336, -96.381968)),
 ('Martin County', (43.6564337, -94.5498419)),
 ('Meeker County', (45.1183643, -94.5175345)),
 ('Mille Lacs County', (45.9311972, -93.640356)),
 ('Morrison County', (45.9926837, -94.2554658)),
 ('Mower County', (43.6832277, -92.753704)),
 ('Murray County', (44.017855, -95.7615205)),
 ('Nicollet County', (44.3380412, -94.2362169)),
 ('Nobles County', (43.6634212, -95.7527672)),
 ('Norman County', (47.3194344, -96.4625779)),
 ('Olmsted County', (43.9997437, -92.3767816)),
 ('Otter Tail County', (46.4184196, -95.713142)),
 ('Pennington County', (48.0513335, -96.0829271)),
 ('Pine County', (46.0820957, -92.7542126)),
 ('Pipestone County', (44.0270012, -96.2566582)),
 ('Polk County', (47.6554613, -96.4193484)),
 ('Pope County', (45.5850258, -95.4469471)),
 ('Ramsey County', (45.0165728, -93.0949501)),
 ('Red Lake County', (47.8605178, -96.0988343)),
 ('Redwood County', (44.3788613, -95.2532373)),
 ('Renville County', (44.7242874, -94.9084771)),
 ('Rice County', (44.3413376, -93.2865484)),
 ('Rock County', (43.6733632, -96.2574328)),
 ('Roseau County', (48.7710371, -95.7697882)),
 ('Saint Louis County', (47.6201005, -92.4363343)),
 ('Scott County', (44.6506998, -93.5025726)),
 ('Sherburne County', (45.4427088, -93.7459202)),
 ('Sibley County', (44.5603522, -94.2085682)),
 ('Stearns County', (45.535326, -94.6139422)),
 ('Steele County', (44.0137336, -93.2203671)),
 ('Stevens County', (45.5837016, -95.9946194)),
 ('Swift County', (45.2797223, -95.6898654)),
 ('Todd County', (46.0588428, -94.887283)),
 ('Traverse County', (45.7836323, -96.4215265)),
 ('Wabasha County', (44.2767596, -92.2018164)),
 ('Wadena County', (46.5850936, -94.9606684)),
 ('Waseca County', (44.0172242, -93.5885717)),
 ('Washington County', (45.0078657, -92.874565)),
 ('Watonwan County', (43.9736055, -94.6370354)),
 ('Wilkin County', (46.3258354, -96.4586194)),
 ('Winona County', (43.9582272, -91.7807784)),
 ('Wright County', (45.1489061, -93.9639196)),
 ('Yellow Medicine County', (44.7198536, -95.8533555))]

In [111]:

len(list(coordinates_list.items()))

Out[111]:

87

In [112]:

import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt


# Convert the coordinates list to a DataFrame
data = pd.DataFrame(list(coordinates_list.items()), columns=["County", "Coordinates"])


# Ensure every entry is a tuple with two elements
data['Coordinates'] = data['Coordinates'].apply(lambda x: x if isinstance(x, tuple) and len(x) == 2 else (None, None))

print(len(data))
# Check if any coordinates are (None, None)
none_coordinates = data[data['Coordinates'] == (None, None)]
print(none_coordinates)

# Extract latitude and longitude into separate columns
data[['Latitude', 'Longitude']] = pd.DataFrame(data['Coordinates'].tolist(), index=data.index)

# Initialize the figure and axes for the plots
fig, ax = plt.subplots(figsize=(14, 10), subplot_kw={'projection': ccrs.PlateCarree()})
ax.add_feature(cfeature.COASTLINE)
ax.add_feature(cfeature.BORDERS, linestyle=':')

# Plot the data points
ax.scatter(data['Longitude'], data['Latitude'], color='red', s=50, edgecolor='k', zorder=5)

# Add labels for each point
for i, row in data.iterrows():
    ax.text(row['Longitude'] + 0.02, row['Latitude'] + 0.02, row['County'], fontsize=12)

# Set the title
ax.set_title('County Coordinates in Minnesota')

# Show the plot
plt.show()

87
Empty DataFrame
Columns: [County, Coordinates]
Index: []

In [113]:

import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt


# Convert the coordinates list to a DataFrame
data = pd.DataFrame(list(coordinates_list.items()), columns=["County", "Coordinates"])


# Ensure every entry is a tuple with two elements
data['Coordinates'] = data['Coordinates'].apply(lambda x: x if isinstance(x, tuple) and len(x) == 2 else (None, None))

print(len(data))
# Check if any coordinates are (None, None)
none_coordinates = data[data['Coordinates'] == (None, None)]
print(none_coordinates)
print(data['Coordinates'])
# Extract latitude and longitude into separate columns
data['Latitude'], data['Longitude'] = zip(*data['Coordinates'])
print(data['Coordinates'])

print(data['Latitude'])
print(data['Longitude'])


# Initialize the figure and axes for the plots
fig, ax = plt.subplots(figsize=(14, 10), subplot_kw={'projection': ccrs.PlateCarree()})
ax.add_feature(cfeature.COASTLINE)
ax.add_feature(cfeature.BORDERS, linestyle=':')

# Plot the data points
ax.scatter(data['Longitude'], data['Latitude'], color='red', s=50, edgecolor='k', zorder=5)

# Add labels for each point
for i, row in data.iterrows():
    ax.text(row['Longitude'] + 0.02, row['Latitude'] + 0.02, row['County'], fontsize=12)

# Set the title
ax.set_title('County Coordinates in Minnesota')

# Show the plot
plt.show()

87
Empty DataFrame
Columns: [County, Coordinates]
Index: []
0     (46.5714822, -93.3847595)
1     (45.2710195, -93.2827625)
2     (46.9298236, -95.6761851)
3     (47.9978537, -94.8799011)
4     (45.7162129, -94.0481042)
                ...            
82    (43.9736055, -94.6370354)
83    (46.3258354, -96.4586194)
84    (43.9582272, -91.7807784)
85    (45.1489061, -93.9639196)
86    (44.7198536, -95.8533555)
Name: Coordinates, Length: 87, dtype: object
0     (46.5714822, -93.3847595)
1     (45.2710195, -93.2827625)
2     (46.9298236, -95.6761851)
3     (47.9978537, -94.8799011)
4     (45.7162129, -94.0481042)
                ...            
82    (43.9736055, -94.6370354)
83    (46.3258354, -96.4586194)
84    (43.9582272, -91.7807784)
85    (45.1489061, -93.9639196)
86    (44.7198536, -95.8533555)
Name: Coordinates, Length: 87, dtype: object
0     46.571482
1     45.271020
2     46.929824
3     47.997854
4     45.716213
        ...    
82    43.973605
83    46.325835
84    43.958227
85    45.148906
86    44.719854
Name: Latitude, Length: 87, dtype: float64
0    -93.384760
1    -93.282763
2    -95.676185
3    -94.879901
4    -94.048104
        ...    
82   -94.637035
83   -96.458619
84   -91.780778
85   -93.963920
86   -95.853356
Name: Longitude, Length: 87, dtype: float64
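
The cell above splits the coordinate tuples with zip(*...). The sketch below shows what that idiom does on a short plain list, and checks it against the equivalent pair of list comprehensions:

```python
coordinates = [
    (46.5714822, -93.3847595),
    (45.2710195, -93.2827625),
    (46.9298236, -95.6761851),
]

# zip(*coordinates) is equivalent to zip(coordinates[0], coordinates[1], coordinates[2]):
# it groups the first elements together, then the second elements
latitudes, longitudes = zip(*coordinates)

print(latitudes)   # (46.5714822, 45.2710195, 46.9298236)
print(longitudes)  # (-93.3847595, -93.2827625, -95.6761851)

# The same split written as two list comprehensions
assert list(latitudes) == [lat for lat, lon in coordinates]
assert list(longitudes) == [lon for lat, lon in coordinates]
```

Note that the two-name unpacking fails with a ValueError when the input is empty, because zip then yields nothing to unpack.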

In [114]:

import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
import cartopy.io.shapereader as shpreader

# Path to the Natural Earth shapefile
shapefile_path = r'G:\My Drive\Python_projects\my_git_pages_website\Py-and-Sky-Labs\content\Python Examples\Data\US_County_borders\ne_10m_admin_2_counties.shp'

# Initialize the figure and axes for the plots
fig, ax = plt.subplots(figsize=(14, 10), subplot_kw={'projection': ccrs.PlateCarree()})

# Add built-in Cartopy features
ax.add_feature(cfeature.COASTLINE)
ax.add_feature(cfeature.BORDERS, linestyle=':')

# Load and plot the county boundaries
reader = shpreader.Reader(shapefile_path)
counties = list(reader.geometries())
ax.add_geometries(counties, ccrs.PlateCarree(), edgecolor='black', facecolor='none')

# Assuming 'data' is your DataFrame with the 'Longitude' and 'Latitude'
ax.scatter(data['Longitude'], data['Latitude'], color='red', s=50, edgecolor='k', zorder=5)

# Optionally add labels for each point
for i, row in data.iterrows():
    ax.text(row['Longitude'] + 0.02, row['Latitude'] + 0.02, row['County'], fontsize=12)

# Set the title
ax.set_title('County Coordinates in Minnesota')

# Show the plot
plt.show()

In [115]:

import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
import cartopy.io.shapereader as shpreader

# Path to the Natural Earth shapefile
shapefile_path = r'G:\My Drive\Python_projects\my_git_pages_website\Py-and-Sky-Labs\content\Python Examples\Data\US_County_borders\ne_10m_admin_2_counties.shp'

# Initialize the figure and axes for the plots
fig, ax = plt.subplots(figsize=(10, 15), subplot_kw={'projection': ccrs.PlateCarree()})

# Add built-in Cartopy features
ax.add_feature(cfeature.COASTLINE)
ax.add_feature(cfeature.BORDERS, linestyle=':')

# Load the shapefile and filter for counties in Minnesota
reader = shpreader.Reader(shapefile_path)
minnesota_counties = [county for county in reader.records() if county.attributes['REGION'] == 'MN']

# Plot only the filtered counties
for county in minnesota_counties:
    geometry = county.geometry
    name = county.attributes['NAME']
    ax.add_geometries([geometry], ccrs.PlateCarree(), edgecolor='black', facecolor='none')
    x, y = geometry.centroid.x, geometry.centroid.y
    ax.text(x, y, name, fontsize=9, ha='center', transform=ccrs.Geodetic())

# Limit the map extent to Minnesota
ax.set_extent([-97.5, -89.5, 43.5, 49.5], crs=ccrs.PlateCarree())  # Adjust these values based on the actual coordinates of Minnesota

# Plot the data points derived from the geocoded lat lon coordinates
ax.scatter(data['Longitude'], data['Latitude'], color='red', s=50, edgecolor='k', alpha=0.5, zorder=1)

# Set the title
ax.set_title('County Coordinates in Minnesota')

# Show the plot
plt.show()
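
The extent passed to set_extent above is hard-coded. As an alternative, the bounding box could be derived from the geocoded points themselves. This is a rough sketch under stated assumptions: extent_from_points is a hypothetical helper, the padding of 0.5 degrees is a guess, and because the points are single locations inside each county the result may still clip the state outline:

```python
def extent_from_points(latitudes, longitudes, padding=0.5):
    """Return [lon_min, lon_max, lat_min, lat_max], widened by `padding` degrees."""
    return [
        min(longitudes) - padding,
        max(longitudes) + padding,
        min(latitudes) - padding,
        max(latitudes) + padding,
    ]


# Three coordinates taken from the geocoding output earlier in the notebook
sample_lats = [43.6466588, 48.7710371, 47.9149076]
sample_lons = [-92.0636359, -95.7697882, -90.47301]

extent = extent_from_points(sample_lats, sample_lons)
print(extent)
```

The list follows the x-min, x-max, y-min, y-max order that set_extent expects.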

In [ ]:

minnesota_counties

`.to_dict()`¶

In [116]:

# # Convert Filtered_top_bot_data to a dictionary mapping countries to life expectancy
# life_expectancy = Filtered_top_bot_data.set_index('country')['lifeExp'].to_dict()
# print(life_expectancy)

Plot using `.items()`¶

In [117]:

# Plot each country's coordinates
# Assuming `top_countries` and `bottom_countries` are lists of country names
# for country, (lat, lon) in coordinates.items():
#     if lat and lon:  # Check if lat and lon are not None
#         color = 'green' if country in top_countries else 'red'
#         plt.plot(lon, lat, marker='o', color=color, markersize=5, transform=ccrs.Geodetic())
#         plt.text(lon, lat, country, transform=ccrs.Geodetic())

# plt.title('Top and Bottom African Countries by Life Expectancy')
# plt.show()

Combine Dictionary Comprehension and iterrows() to create a dictionary based on multiple columns of a dataframe¶

In [118]:

# Extending the DataFrame with another column
data = {'Words': ["apple", "banana", "cherry"], 'Type': ["fruit", "fruit", "fruit"]}
df = pd.DataFrame(data)

# Dictionary mapping word to a tuple of (word length, type)
word_info_dict = {row['Words']: (len(row['Words']), row['Type']) for index, row in df.iterrows()}

print(word_info_dict)

{'apple': (5, 'fruit'), 'banana': (6, 'fruit'), 'cherry': (6, 'fruit')}
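
The same dictionary can be built without iterrows() by pairing the two columns with zip(). The sketch below works on the plain dictionary of lists, so it runs without pandas:

```python
data = {"Words": ["apple", "banana", "cherry"], "Type": ["fruit", "fruit", "fruit"]}

# Pair the two columns element by element instead of iterating over rows
word_info_dict = {word: (len(word), kind) for word, kind in zip(data["Words"], data["Type"])}

print(word_info_dict)
# {'apple': (5, 'fruit'), 'banana': (6, 'fruit'), 'cherry': (6, 'fruit')}
```

With a DataFrame, zip(df['Words'], df['Type']) works the same way. It avoids building a Series for every row, which iterrows() does, so it is usually faster on larger frames.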

generator expressions¶

my_generator = (x*x for x in range(10))
for value in my_generator:
    print(value)
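
A generator expression uses the same syntax as a list comprehension but with parentheses instead of brackets, and it produces its values lazily. The sketch below illustrates two consequences: values are computed only on request, and a generator can be consumed only once.

```python
my_generator = (x * x for x in range(10))

# Values are produced one at a time, only when requested
print(next(my_generator))  # 0
print(next(my_generator))  # 1

# sum() consumes whatever is left: 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81
print(sum(my_generator))  # 284

# The generator is now exhausted, so a second pass yields nothing
print(list(my_generator))  # []
```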

Introduction to List Comprehensions¶

Generalized Examples:¶

Basic List Comprehension¶

Conditional List Comprehension¶

Extracting data using list comprehensions¶

Understanding DataFrame Iteration with iterrows()¶

Example:¶

Data cleaning with iterrows()¶

Multi-column conditional flagging of rows with itterows()¶

Data transformation with itterows()¶

Multi-column Conditional Flagging or Computation with itterows()¶

Mark specific rows with itterows()¶

Combine List Comprehension and iterrows() to extract a specific list from a dataframe¶

Introduction to Dictionary Comprehensions¶

Generalized Examples:¶

Basic Dictionary Comprehension¶

Conditional Dicionary Comprehension¶

Using Functions in Dictionary Comprehension¶

Using a dataframe in a Dictionary Comprehension¶

Introduction to Geocoding with Nominatim via Geopy¶

.to_dict()¶

Plot using .items()¶

Combine Dictionary Comprehension and iterrows() to create a dictionary based on multiple columns of a dataframe¶

generator expressions¶

links

social

Understanding DataFrame Iteration with `iterrows()`¶

Data cleaning with `iterrows()`¶

Multi-column conditional flagging of rows with `itterows()`¶

Data transformation with `itterows()`¶

Multi-column Conditional Flagging or Computation with `itterows()`¶

Mark specific rows with `itterows()`¶

`.to_dict()`¶

Plot using `.items()`¶