Quote from @Austin Bright:
Quote from @David D.:
Quote from @Austin Bright:
Quote from @David D.:
I’ve been using the ATTOM API; they’re really generous with a 30-day free trial of everything. You can ask a generative AI to write the code to pull and save the data to CSV, and it’ll write pretty efficient code.
I looked into them. They are expensive. My goal, realistically, is to do custom lead mining on Fiverr to offset the costs of my own marketing, so I’d have to keep costs low for it to work. Right now I pull county data into Excel for free and upload that into DealMachine, which offers free skip tracing. It’s the cheapest marketing model I’ve put together: I can send 36,000 texts and be all in for a little under a grand a month. Offering custom lead mining would help me boost revenue because deal volume has fallen quite a bit.
John’s scraper option looks really interesting, though I’ve never heard of the site so I can’t vouch for it. However, I have used some of Apify’s scrapers (for non-real-estate uses) and was highly satisfied with the value for the price:
https://apify.com/tri_angle/redfin-search It looks like it’s for listed properties only? And based on the sample code, it looks like I need an MLS ID.
So I haven't tested this code, but it should work. (This is a very roundabout way to do it; the faster way is to skip the Redfin scraper entirely and find another scraper in the Apify store, such as a Zillow scraper, that takes addresses in the format you already have.)
Instead of the sample addresses in the array at the bottom, substitute your actual addresses. You can ask the gen AI you're using how to extract the addresses from your Excel spreadsheet (or wherever they are) and put them in a Python list.
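As a rough sketch of that extraction step: if you export the sheet to CSV first, Python's built-in csv module can pull the column out without any extra libraries. (The file name "leads.csv" and the column header "address" are just assumptions here; match them to your actual spreadsheet.)

```python
import csv

def load_addresses(csv_path, column="address"):
    """Read one column of addresses out of a CSV export of the spreadsheet."""
    with open(csv_path, newline="") as f:
        # DictReader uses the header row, so the column can be anywhere in the sheet
        return [row[column].strip() for row in csv.DictReader(f) if row.get(column)]

# Example (assumes a CSV exported from Excel with an "address" header):
# addresses = load_addresses("leads.csv")
```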
This code takes each address and reformats it into a Redfin latitude/longitude search. If the coordinates are precise enough, it should find the home and get its MLS ID, then feed that ID, via Python's requests library, into an API request to the scraper on apify.com. However, you might need to use Apify's official Python client to get this to work.
To run Python code quickly, you can use Google Colaboratory, which hosts Jupyter notebooks for free and saves them to your Google Drive if you have one. An alternative is a service called Binder.
To get the property data formatted nicely into a "data frame" (a standard tabular dataset that you can export to CSV or Excel), you can use pandas' json_normalize function on the JSON output you get in the code cell when you run it. I can make a post covering that if the gen AI doesn't explain it well.
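Here's a minimal sketch of that flattening step. The field names in the sample records are made up for illustration; the real scraper output will have different keys, but the pattern is the same:

```python
import pandas as pd

# Hypothetical sample of what scraper results might look like;
# the real Apify output will have different field names.
sample = [
    {"address": {"street": "123 Main St", "city": "Los Angeles"}, "price": 500000},
    {"address": {"street": "456 Elm St", "city": "San Francisco"}, "price": 900000},
]

# json_normalize flattens nested JSON into dotted columns like "address.street"
df = pd.json_normalize(sample)

# Export the tabular result; Excel can open the CSV directly
df.to_csv("properties.csv", index=False)
```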
"""
import json
import time

import requests
from geopy.geocoders import Nominatim

APIFY_API_TOKEN = 'YOUR_APIFY_API_TOKEN'  # Replace with your Apify API token

def geocode_address(address):
    """Convert a street address to (latitude, longitude) via OpenStreetMap's Nominatim."""
    geolocator = Nominatim(user_agent="redfin_scraper")
    location = geolocator.geocode(address)
    if location:
        return location.latitude, location.longitude
    return None, None

def search_redfin(lat, lng):
    """Query Redfin's autocomplete endpoint for the property at the given coordinates."""
    search_url = f"https://www.redfin.com/stingray/do/location-autocomplete?location={lat},{lng}"
    headers = {'User-Agent': 'Mozilla/5.0'}
    response = requests.get(search_url, headers=headers)
    if response.status_code == 200:
        return response.json()
    return None

def extract_home_id(search_results):
    """Pull the Redfin home ID out of the autocomplete payload, if present."""
    for result in search_results['payload']:
        if 'home' in result:
            return result['home']['id']
    return None

def construct_redfin_detail_url(state, city, street, zip_code, home_id):
    """Build a Redfin property-detail URL in the usual /STATE/City/Street-Zip/home/ID shape."""
    street_formatted = street.replace(' ', '-')
    city_formatted = city.replace(' ', '-')
    return f"https://www.redfin.com/{state}/{city_formatted}/{street_formatted}-{zip_code}/home/{home_id}"

def query_apify_redfin_scraper(detail_url):
    """Start an Apify actor run for the given detail URL and return the run ID."""
    api_url = "https://api.apify.com/v2/acts/tri_angle~redfin-detail/runs?token=" + APIFY_API_TOKEN
    payload = {
        "detailUrls": [{"url": detail_url}],
        "debugLog": False
    }
    headers = {
        "Content-Type": "application/json"
    }
    response = requests.post(api_url, headers=headers, data=json.dumps(payload))
    if response.status_code == 201:
        return response.json()['data']['id']
    return None

def get_apify_scraper_results(run_id):
    """Poll the run's default dataset until items show up."""
    api_url = f"https://api.apify.com/v2/acts/tri_angle~redfin-detail/runs/{run_id}/dataset/items?token={APIFY_API_TOKEN}"
    while True:
        response = requests.get(api_url)
        if response.status_code == 200:
            data = response.json()
            if data:
                return data
        time.sleep(10)  # Wait 10 seconds before retrying

def main(addresses):
    results = []
    for address in addresses:
        lat, lng = geocode_address(address)
        if lat and lng:
            search_results = search_redfin(lat, lng)
            if search_results:
                home_id = extract_home_id(search_results)
                if home_id:
                    # Expects addresses like "123 Main St, Los Angeles, CA, 90001"
                    parts = address.split(',')
                    street = parts[0].strip()
                    city = parts[1].strip()
                    state = parts[2].strip()
                    zip_code = parts[3].strip()
                    detail_url = construct_redfin_detail_url(state, city, street, zip_code, home_id)
                    run_id = query_apify_redfin_scraper(detail_url)
                    if run_id:
                        scraper_results = get_apify_scraper_results(run_id)
                        results.append(scraper_results)
    return results

addresses = [
    "123 Main St, Los Angeles, CA, 90001",
    "456 Elm St, San Francisco, CA, 94102"
    # Add more addresses here
]

results = main(addresses)
print(json.dumps(results, indent=2))
"""