Updated over 8 years ago
![Aleks Petrov's profile image](https://bpimg.biggerpockets.com/no_overlay/uploads/social_user/user_avatar/430744/1621476313-avatar-aleksp.jpg?twic=v1/output=image/cover=128x128&v=2)
Creating a County Web Scraper
Hi BiggerPockets community!
My name is Aleksei, and I’m planning :) to invest in RE.
I live in an expensive area, so my steps in RE have to be very careful. I’m an active listener of the BP video podcasts, and they are an excellent source of information.
While I’m learning the theory, I wondered what I could do with my knowledge as a software developer.
I know that all property data is publicly accessible, but you cannot search it using particular filters. So I decided to write my own web scraper and build my own database.
I picked 1 county for a pilot project, and in this topic I’ll post updates about the challenges of web scraping.
Please let me know if this is an interesting topic, and I’ll keep posting updates.
Thanks, BiggerPockets, for being such an awesome resource!
Best regards, Aleksei.
Most Popular Reply
![Aleks Petrov's profile image](https://bpimg.biggerpockets.com/no_overlay/uploads/social_user/user_avatar/430744/1621476313-avatar-aleksp.jpg?twic=v1/output=image/cover=128x128&v=2)
@Trevor Ewen thanks for the reply!
"What then" is a good question...
I’m planning to create a full copy of the DB, so I can run any query on demand.
Goal #1: get all available APNs for the given county.
Goal #2: get all data for those APNs and save it to a local DB.
Part 1:
The web site has a search form with 1 billion possible inputs, so I need to iterate through them to find the valid numbers.
I wrote a script that uses a web browser to enter each number and submit the search. I decided to store the results in a CSV file.
One browser executes a search in 2 seconds, and I was able to run 16 browsers in total (2 machines x 8 browsers).
2 seconds * 1 billion / 16 browsers = 125,000,000 seconds ≈ 1,447 days, which is not acceptable.
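For anyone who wants to play with the numbers, here is a minimal sketch of that estimate in Python. The function name `estimate_days` is made up for illustration, and it assumes the work splits perfectly evenly across workers with no failures or retries, which a real run would not quite achieve.

```python
SECONDS_PER_DAY = 86_400


def estimate_days(total_inputs: int, seconds_per_request: float, workers: int) -> float:
    """Days needed to try every input, assuming perfectly even parallelism."""
    total_seconds = total_inputs * seconds_per_request / workers
    return total_seconds / SECONDS_PER_DAY


# Browser approach: 2 seconds per search, 16 browsers, 1 billion inputs.
browser_days = estimate_days(1_000_000_000, 2.0, 16)
print(f"Browser approach: about {browser_days:,.0f} days")
```

Changing the per-request time or the worker count shows quickly which of the two matters more for a given setup.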
The next solution is to use API requests and skip the browsers.
In this case one request/response takes 0.2 seconds, and I can run ~10 requests in parallel:
0.2 seconds * 1 billion / 10 = 20,000,000 seconds ≈ 231 days. Much better than the previous result, but still slow.
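A rough sketch of running about 10 lookups in parallel could look like the following. It uses Python's standard `ThreadPoolExecutor`; the names `find_valid_apns` and `fake_lookup` are hypothetical, and `fake_lookup` is only a stand-in for the real HTTP request, since the county site's actual API is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def find_valid_apns(
    candidates: Iterable[int],
    lookup: Callable[[int], bool],
    workers: int = 10,
) -> List[int]:
    """Check candidates concurrently; return those for which lookup() is True."""
    numbers = list(candidates)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() keeps the input order, so results line up with numbers.
        results = list(pool.map(lookup, numbers))
    return [apn for apn, exists in zip(numbers, results) if exists]


def fake_lookup(apn: int) -> bool:
    """Placeholder for the real request: pretends every 7th number exists."""
    return apn % 7 == 0


found = find_valid_apns(range(1, 30), fake_lookup)
print(found)
```

For the full 1 billion inputs the candidates would have to be fed in chunks rather than as one list, with the valid numbers from each chunk written out before moving on.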
Right now I don’t have a better solution...
I’ve noticed that I can search APNs on (at least) 2 different web sites. My next step is to check whether these sites use different DBs (or copies); if so, I can double my speed by hitting both.
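If the two sites do turn out to be independent, one simple way to divide the work is to deal the candidate numbers out round-robin. This is only a sketch: `split_across_sites` is a hypothetical helper, and `"site_a"` and `"site_b"` are placeholder labels, not the real sites.

```python
from itertools import cycle
from typing import Dict, Iterable, List, Sequence


def split_across_sites(
    candidates: Iterable[int], sites: Sequence[str]
) -> Dict[str, List[int]]:
    """Deal candidate numbers out to the sites round-robin for an even split."""
    plan: Dict[str, List[int]] = {site: [] for site in sites}
    for apn, site in zip(candidates, cycle(sites)):
        plan[site].append(apn)
    return plan


plan = split_across_sites(range(1, 7), ["site_a", "site_b"])
print(plan)
```

The speed only doubles if each site is served by its own backend; if both front ends share one database server, the split just moves the bottleneck.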
Will keep you posted.