
All Forum Posts by: David D.

David D. has started 1 post and replied 19 times.

Quote from @Austin Bright:
Quote from @David D.:

Oh also, I think Redfin gives off-market homes an MLS ID anyway, so it should have whatever home you are interested in.


Is the MLS ID the number after the last "/" in this URL? I can use Python to get this; I bet I can generate a formula in Power Query to do the rest and not use a scraper at all.

https://www.redfin.com/TX/Fort-Worth/10725-Lone-Pine-Ln-7610...



That's actually the property ID. It works similarly to an MLS ID but is specific to Redfin, which uses it to track properties uniquely. But yeah, if you can grab the data with Power Query and are comfortable using it, that works!
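For example, here's a quick way to grab that trailing ID in Python (an untested sketch; the URL below is a made-up example in the same format as the one above):

"""
# Grab the trailing numeric ID from a Redfin detail URL
url = "https://www.redfin.com/TX/Fort-Worth/123-Example-St-76108/home/123456789"  # made-up example URL
property_id = url.rstrip("/").split("/")[-1]
print(property_id)  # -> 123456789
"""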


Post: Using a predictive model to find undervalued properties.

David D. Posted
  • New to Real Estate
  • Posts 19
  • Votes 3
Quote from @Jonathan Greene:
Quote from @David D.:
Quote from @Jonathan Greene:

There are companies with billions of dollars in war chests doing this all the time. There is no reason to think that this will help you invest in real estate. You are basically trying to take the real estate out of real estate.

Predictive analytics are best used as guides to the work. You sound (I could be wrong, but probably not) like you want to use your background to eliminate some of the guesswork, but real estate investing is learned by being at properties, talking with other investors, going to meetups, etc.

Predictive data for investors is wholly flawed because it can't see inside of the houses.

This makes sense. Basically what I'm asking is, "Can a simple model and strategy get me 70% or more of the profit that a very experienced investor with 20 years in the business would get?" The point that I'd need to know the internals of the home makes a lot of sense. You might think that 50% or more of the time I'd lose money on the investment because the model doesn't explicitly include those internals. However, you could then pull a bunch of floor plans/wiring diagrams and photos of the home from an MLS and use those as features to predict price.

I’m also not 100% sure that model would be better than a “no internals” model, especially if the no internals one uses neighborhood points of interest and other geospatial stuff.

Also, I just heard the latest episode of your podcast and it had some helpful flipping advice! The strategy here, I think, would be to do just a cosmetic remodel on a home that is highly unlikely to need any other improvements. Since I'm looking at apartments mostly, that also makes things kind of easier.


This is a similar model to Opendoor or the other companies that bought a ton based solely on data and analytics, and not real estate acumen. They have lost more money than anyone can count. The model will seem like it will work, but what happens is that a small market shift tanks the entire algorithm in one month and then you can't sell anything.


My hope is that I would avoid this by being selective about the properties and not just running a "set it and forget it" algorithm. That is, I would let the model choose a couple of promising properties for me, and then actually visit them with an expert to assess them. I figure Opendoor just chose massive bundles of properties and set up a bunch of automated buying processes, but I don't know.

Quote from @Austin Bright:
Quote from @David D.:
Quote from @Austin Bright:
Quote from @David D.:

I've been using the ATTOM API, and they're really generous with a free trial of everything for 30 days. You can ask a generative AI to write the code to pull and save the data to CSV, and it'll write pretty efficient code.


I looked into them. They are expensive. My goal, realistically, is to do custom lead mining on Fiverr to offset the costs of my own marketing. I'd have to keep costs low for it to work. Right now I pull county data into Excel for free and upload that into DealMachine. They offer free skip tracing. It's the cheapest marketing model I've put together: I can send 36,000 texts and be all in for a little under a grand a month. Offering custom lead mining would help me boost revenue because deal volume has fallen quite a bit.

John's scraper option looks really interesting, though I've never heard of the site so I can't vouch for it. However, I have used some of Apify's scrapers (for non-real-estate uses) and was highly satisfied with the value for the price: https://apify.com/tri_angle/redfin-search
It looks like it's for listed properties only? It looks like I need an MLS ID, based on the sample code.

So I haven't tested this code, but it should work. (This is a very roundabout way to do it; the fast way is to skip the Redfin scraper and find another scraper in the Apify store, like a Zillow scraper, that takes the addresses in the format you already have them in.)

Replace the list of addresses in the array with your actual addresses. Ask the gen AI you're using how to extract the addresses from your Excel spreadsheet (or whatever they're in) and put them in a Python list.
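For example, if the addresses are in a spreadsheet, something like this should work (a minimal sketch; the file name and column name are placeholders for whatever you actually have):

"""
# Minimal sketch: pull an address column out of an Excel file into a Python list
# "addresses.xlsx" and "Address" are placeholder names
import pandas as pd

addresses = pd.read_excel("addresses.xlsx")["Address"].dropna().tolist()
print(addresses[:5])
"""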

This code will take each address and reformat it into a Redfin latitude/longitude search. If the coordinates are precise enough, it should find the home and get its Redfin property ID, then use Python's requests library to feed the resulting detail URL into an API request to the scraper on apify.com. However, you might need to use the actual Python client for apify.com to get this to work.

To run Python code quickly, you can use Google Colaboratory, which hosts Jupyter notebooks for you for free and saves them to your Google Drive if you have one. An alternative is something called Binder.

To get the property data formatted nicely into a "data frame" (a standard tabular dataset you can export to CSV or Excel), you can use pandas' json_normalize function on the JSON output you get in the code cell when you run it. I can make a post covering that if the gen AI doesn't explain it well.


"""
from geopy.geocoders import Nominatim
import requests
from bs4 import BeautifulSoup
import json

APIFY_API_TOKEN = 'YOUR_APIFY_API_TOKEN' # Replace with your Apify API token

def geocode_address(address):
geolocator = Nominatim(user_agent="redfin_scraper")
location = geolocator.geocode(address)
if location:
return location.latitude, location.longitude
return None, None

def search_redfin(lat, lng):
search_url = f"https://www.redfin.com/stingray/do/location-autocomplete?location={lat},{lng}"
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(search_url, headers=headers)
if response.status_code == 200:
return response.json()
return None

def extract_home_id(search_results):
for result in search_results['payload']:
if 'home' in result:
return result['home']['id']
return None

def construct_redfin_detail_url(state, city, street, zip_code, home_id):
street_formatted = street.replace(' ', '-')
city_formatted = city.replace(' ', '-')
return f"
def" class="redactor-autoparser-object">https://www.redfin.com/{state}/{city_formatted}/{street_form...
query_apify_redfin_scraper(detail_url):
api_url = "https://api.apify.com/v2/acts/tri_angle~redfin-detail/runs?token=" + APIFY_API_TOKEN
payload = {
"detailUrls": [{"url": detail_url}],
"debugLog": False
}
headers = {
"Content-Type": "application/json"
}
response = requests.post(api_url, headers=headers, data=json.dumps(payload))
if response.status_code == 201:
run_id = response.json()['data']['id']
return run_id
return None

def get_apify_scraper_results(run_id):
api_url = f"https://api.apify.com/v2/acts/tri_angle~redfin-detail/runs/{run_id}/dataset/items?token=" + APIFY_API_TOKEN
while True:
response = requests.get(api_url)
if response.status_code == 200:
data = response.json()
if data:
return data
time.sleep(10) # Wait for 10 seconds before retrying
return None

def main(addresses):
results = []
for address in addresses:
lat, lng = geocode_address(address)
if lat and lng:
search_results = search_redfin(lat, lng)
if search_results:
home_id = extract_home_id(search_results)
if home_id:
parts = address.split(',')
street = parts[0].strip()
city = parts[1].strip()
state_zip = parts[2].strip().split(' ')
state = state_zip[0]
zip_code = state_zip[1]
detail_url = construct_redfin_detail_url(state, city, street, zip_code, home_id)
run_id = query_apify_redfin_scraper(detail_url)
if run_id:
scraper_results = get_apify_scraper_results(run_id)
results.append(scraper_results)
return results

addresses = [
"123 Main St, Los Angeles, CA, 90001",
"456 Elm St, San Francisco, CA, 94102"
# Add more addresses here
]

detail_urls = main(addresses)
print(json.dumps(detail_urls, indent=2))
"""


Post: Using a predictive model to find undervalued properties.

David D. Posted
  • New to Real Estate
  • Posts 19
  • Votes 3
Quote from @Jonathan Greene:

There are companies with billions of dollars in war chests doing this all the time. There is no reason to think that this will help you invest in real estate. You are basically trying to take the real estate out of real estate.

Predictive analytics are best used as guides to the work. You sound (I could be wrong, but probably not) like you want to use your background to eliminate some of the guesswork, but real estate investing is learned by being at properties, talking with other investors, going to meetups, etc.

Predictive data for investors is wholly flawed because it can't see inside of the houses.

This makes sense. Basically what I'm asking is, "Can a simple model and strategy get me 70% or more of the profit that a very experienced investor with 20 years in the business would get?" The point that I'd need to know the internals of the home makes a lot of sense. You might think that 50% or more of the time I'd lose money on the investment because the model doesn't explicitly include those internals. However, you could then pull a bunch of floor plans/wiring diagrams and photos of the home from an MLS and use those as features to predict price.

I’m also not 100% sure that model would be better than a “no internals” model, especially if the no internals one uses neighborhood points of interest and other geospatial stuff.

Also, I just heard the latest episode of your podcast and it had some helpful flipping advice! The strategy here, I think, would be to do just a cosmetic remodel on a home that is highly unlikely to need any other improvements. Since I'm looking at apartments mostly, that also makes things kind of easier.

Post: Using a predictive model to find undervalued properties.

David D. Posted
  • New to Real Estate
  • Posts 19
  • Votes 3
Quote from @Drew Sygit:

@David D. there are challenges with what you want to do, specifically in the City of Detroit:

1) You'd have to do this for each of the city's 185 or so Neighborhoods. Zip codes are too big.

2) You'd have to factor in Neighborhood population density somehow. Not aware of a reliable source for this info - and please don't think US Census Bureau is the answer.

3) How do you account for "property condition"?


1) Here's one neighborhood-level strategy: take 48201 (Midtown Detroit), divide it up into neighborhoods, and take the homes with the greatest difference between predicted and actual values in each of the neighborhood-level models. I'm not sure why this would be better than the original strategy, though, and I'm skeptical you'd get enough neighborhood-level sales.

2) I just found out ATTOM has 4 levels of neighborhood data, even down to the residential subdivision, with population density for each! https://www.attomdata.com/data/boundaries-data/area-neighbor... I'm excited to try modeling at this level (they also do 5-year projections).

3) One strategy might be to do the same thing we just did with sales, but for property condition: plug in known properties, fit a curve as with sales (baths, population density, and so on), and try to predict the property's condition from those features within an error rate (except this time you'd be predicting the probability of a condition rather than a sale price). Another simple idea is to just have an expert assess the condition of a few of the best properties you find with the original strategy.
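A minimal sketch of that condition idea, assuming you have labeled conditions for some known properties (the file names and column names below are hypothetical placeholders, and logistic regression is just one way to get class probabilities):

"""
# Predict the probability of each condition class from simple features
# "known_properties.csv", "candidate_properties.csv", and the column names are all hypothetical
import pandas as pd
from sklearn.linear_model import LogisticRegression

known = pd.read_csv("known_properties.csv")
features = ["sqft", "baths", "year_built", "pop_density"]

clf = LogisticRegression(max_iter=1000)
clf.fit(known[features], known["condition"])  # e.g., "poor" / "fair" / "good"

candidates = pd.read_csv("candidate_properties.csv")
probs = clf.predict_proba(candidates[features])  # one probability per condition class
print(dict(zip(clf.classes_, probs[0].round(2))))
"""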

Another idea I had was to use standard research on building systems (air conditioning, wiring, plumbing, and so on) to estimate how long they typically last before needing repairs.

Here's a fairly recent paper that discusses "energy modeling" for single-family homes and is an example of what I'm thinking of:

https://www.mdpi.com/1996-1073/12/8/1537

Post: Using a predictive model to find undervalued properties.

David D. Posted
  • New to Real Estate
  • Posts 19
  • Votes 3

I am curious if anyone has employed the simple strategy I was thinking of for investing in any kind of property:

1. Use something like the ATTOM API to get historical sales snapshots, or just all the historical sales of the property class for a metropolitan area.
2. Train a regression model using square footage, bathroom count, or more advanced features to predict the sale prices.
3. Get the error of your predicted housing price (if you're using price per square foot, convert to housing price) down to $10K-$40K or so.
4. Find properties whose characteristics predict they should be selling for 2 standard deviations above their actual price/value (basically, properties that are selling for way below what the model predicts they should sell for).
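To make steps 2-4 concrete, here's a minimal sketch with scikit-learn (untested; the file name and feature columns are hypothetical placeholders, and a plain linear regression stands in for whatever model you'd actually use):

"""
# Train a simple price model, measure its error (step 3), and flag homes selling
# well below prediction (step 4). File and column names are hypothetical placeholders.
import pandas as pd
from sklearn.linear_model import LinearRegression

sales = pd.read_csv("attom_sales_snapshot.csv")
features = ["sqft", "baths", "beds", "lot_size"]

model = LinearRegression()
model.fit(sales[features], sales["sale_price"])

sales["predicted"] = model.predict(sales[features])
sales["residual"] = sales["predicted"] - sales["sale_price"]

# Step 3: is the typical error in the $10K-$40K range?
print(f"Mean absolute error: ${sales['residual'].abs().mean():,.0f}")

# Step 4: properties selling at least 2 standard deviations below prediction
threshold = 2 * sales["residual"].std()
undervalued = sales[sales["residual"] >= threshold]
print(undervalued[["sale_price", "predicted", "residual"]].head())
"""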

The challenge here, of course, is step 3. I just started experimenting with downtown Detroit using ATTOM's "sales snapshots," and I haven't gotten that kind of performance yet. However, I'm sure many people have, especially if they have images or floor plans of the houses, other features, and lots of data.

I apologize if this is all just fairly standard financial modeling. A key thing here: I don't have a finance/economics background and am just getting started in this area, but this struck me as a good strategy to use. I'm also looking at neighborhood trends. My worry is that this wouldn't work because any undervalued property has its price for a "reason," i.e., it is already at its equilibrium price by the time you've determined that it's "undervalued," and you won't actually be able to flip it for a huge profit.

I don't understand how that could be the case, though. What seems more intuitive to me is that these are the homes that just happen to have been ignored by other investors so far, or passed over for better deals, since a small group of investors can't snatch up all of the deals (otherwise why would this forum exist, haha).

Another argument might be that you can never tell when a neighborhood is going to crash in its property values, or when people will migrate to a nearby one that is flourishing. If this were the case, then your model wouldn't be effective anyway. Presumably you can also protect against this with the aforementioned neighborhood growth trends.