To show you a solid selection of living organisms you are likely to see in Ponte Vedra, I downloaded a list of species from GBIF. The list was so huge that I removed species that had been reported fewer than ten times. That left me with 527 species. Can you guess how many of those species were birds?
Answer: 280
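For anyone who wants to try the same kind of filtering, here is a minimal pandas sketch. The column names (`species`, `class`, `occurrences`) and the sample rows are made up for illustration; a real GBIF download has its own headers, and this is not necessarily how the list above was trimmed.

```python
import pandas as pd

# Hypothetical sample rows; a real GBIF species list uses its own column names.
species_df = pd.DataFrame({
    "species": ["Pandion haliaetus", "Elanoides forficatus",
                "Serenoa repens", "Mergus serrator"],
    "class": ["Aves", "Aves", "Liliopsida", "Aves"],
    "occurrences": [412, 35, 120, 7],
})

MIN_REPORTS = 10  # drop anything reported fewer than ten times

# Keep only the species with at least MIN_REPORTS reports
common = species_df[species_df["occurrences"] >= MIN_REPORTS]

# Count how many of the remaining species are birds (class Aves)
bird_count = int((common["class"] == "Aves").sum())

print(f"{len(common)} species kept, {bird_count} of them birds")
```

With a real download, the DataFrame would come from `pd.read_csv` instead of being typed in by hand.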
Fish hawk (Osprey)
I wanted to research each species, but how long would that take me? I’m just not that into it. I looked for ways to automate at least the bulk of the research. I asked ChatGPT how I could look up the diet for 280 birds at one time. ChatGPT offered several suggestions. I chose to scrape the Wikipedia pages for those birds with a Python script. Although there are several websites with great information about birds, they aren’t as scrapable as Wikipedia. And although you can do the same thing through Google Apps Script, I’d been wanting to give Python a test run. My Intro to Programming instructor talked it up a lot but didn’t teach it to us. Here’s the script I used to scrape 280 bird pages on Wikipedia:
import pandas as pd
import requests
from bs4 import BeautifulSoup
import time

# Load bird names from CSV
bird_df = pd.read_csv('bird_names_for_diet_scraping.csv')


def get_diet_info_wikipedia(bird_name):
    """Return the first Wikipedia paragraph that mentions diet, or None."""
    url = f"https://en.wikipedia.org/wiki/{bird_name.replace(' ', '_')}"
    headers = {'User-Agent': 'Mozilla/5.0'}
    try:
        # A timeout keeps one slow page from stalling the whole run
        response = requests.get(url, headers=headers, timeout=10)
        if response.status_code != 200:
            return None
        soup = BeautifulSoup(response.text, 'html.parser')
        paragraphs = soup.find_all('p')
        for p in paragraphs:
            text = p.get_text().strip()
            lowered = text.lower()
            # Plain substring checks, so 'eats' also matches inside words like 'threats'
            if 'diet' in lowered or 'feeds on' in lowered or 'eats' in lowered:
                return text
        return None
    except requests.RequestException:
        # Network errors and timeouts: skip this bird
        return None


# Add diet info to DataFrame
diet_data = []
for name in bird_df['iNaturalist Common Name']:
    print(f"Searching: {name}")
    diet = get_diet_info_wikipedia(name)
    diet_data.append(diet)
    time.sleep(1.5)  # Be polite to Wikipedia's servers
bird_df['Diet Description'] = diet_data

# Save results
bird_df.to_csv('bird_diets_from_wikipedia.csv', index=False)
print("Done. Results saved to bird_diets_from_wikipedia.csv.")
I loved it. It was much faster than Google Apps Script. I also like having my files on my own machine. Now that I’ve installed Python, I don’t have to skip that option in other projects.
And what did Python give me? Here are two examples:
Great work!
The swallow-tailed kite feeds on small reptiles, such as snakes and lizards.[15] It may also feed on small amphibians such as frogs; large insects, such as grasshoppers, crickets, termites, ants, wasps, dragonflies, beetles, and caterpillars; small birds and their eggs and nestlings; and small mammals including bats.[16][17] It has also been known to prey on fish.[18] It has been observed to regularly consume fruit in Central America.[19] It drinks by skimming the surface and collecting water in its beak. The bird usually does not break flight during feeding.[8]
Oh well…
The species [Red-breasted Merganser] is widespread and common enough to be categorized as least concern by the IUCN, though populations in some areas may be declining. Threats include habitat loss through wetland destruction, exposure to toxins such as pesticides and lead, and becoming bycatch of commercial fishing operations.[9] Anglers and fish farmers have also persecuted the species, which they regard as a competitor, though the impact of this on the species’ population is not known.[10]
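The merganser miss has a likely explanation: the script checks for the substring 'eats', and the word "Threats" in that paragraph contains it. Below is a minimal sketch of one way to tighten the check by matching whole words with a regular expression. The function name `mentions_diet` and the pattern are illustrative assumptions, not part of the script above.

```python
import re

# Whole-word patterns: \b stops 'eats' from matching inside 'threats'.
DIET_PATTERN = re.compile(r"\b(diet|feeds on|eats)\b", re.IGNORECASE)


def mentions_diet(text):
    """Return True if the text contains a diet keyword as a whole word."""
    return bool(DIET_PATTERN.search(text))


# The conservation paragraph no longer counts as a diet paragraph...
print(mentions_diet("Threats include habitat loss through wetland destruction."))
# ...while a real feeding description still does.
print(mentions_diet("The swallow-tailed kite feeds on small reptiles."))
```

The trade-off is that whole-word matching also skips variants such as "diets" or "dietary", so the keyword list might need a few extra entries.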
Still, I got a huge head start on my long list of birds.