

Final project for my Problems in Geoscience class - Project Proposal
Title:
Location-allocation analysis – Using network analysis to site a new café in Duluth, MN
Group Members:
Sam Arcand, Ben Fraser
Appeal:
Network analytics, especially location-allocation, is a powerful tool, and we think that anyone who understands how it works and how to use it would be well positioned for success in real estate investing. The project also gives us an opportunity to use SQL to create new, original data by joining existing relations in novel ways:
- Our goal is to site a café in Duluth, MN, and we need to know the property values. In reviewing the data we downloaded from St. Louis County, MN, we became curious about parcels with low estimated total values (“EstTotalVa”) of $0-1000. It turns out that most of these properties are owned by the State of MN and the City of Duluth. To find them all, we wrote this SQL command:
SELECT * FROM Parcels_St_Louis_County_MN WHERE "PHYSCITY" = ' ' AND "EstTotalVa" <1000
Note: It was necessary to search for "PHYSCITY" = ' ' because the PHYSCITY column was left blank for some of the parcels.
- We then used the Duluth municipal boundary shapefile to clip the 11,666 results from this selection, since we are only interested in Duluth property values. We performed a kriging analysis, which indicated that a number of State-owned and City-owned properties are significantly undervalued in the dataset. Kriging is a form of geostatistical interpolation; we applied it using the actual values of surrounding properties that were not owned by the City of Duluth or the State of Minnesota. It is not in itself surprising that neither municipalities nor the State assess taxes on themselves. However, in the unlikely event that the State of Minnesota or the City of Duluth decides to sell any of these properties, we now have approximate (+/-) values from which to begin a fair negotiation.
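The interpolation step can be illustrated with a minimal ordinary-kriging sketch in plain Python. This is not the workflow we ran in GIS software: the kriging variant, the exponential variogram, its sill and range, and the parcel coordinates and values below are all assumptions chosen only to show how surrounding parcel values are weighted into an estimate.

```python
import math

def exp_variogram(h, sill=1.0, rng=500.0):
    """Exponential semivariogram without a nugget. sill and rng are assumed, not fitted."""
    return sill * (1.0 - math.exp(-h / rng))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(points, values, target):
    """Estimate the value at target from known (point, value) pairs."""
    n = len(points)
    # Kriging system: variogram between all known points, plus a row and
    # column of ones (Lagrange multiplier) that force the weights to sum to 1.
    a = [[exp_variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [exp_variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    estimate = sum(w * v for w, v in zip(weights, values))
    return estimate, weights

# Hypothetical privately owned neighbors: (x, y) in meters and assessed values in dollars.
neighbors = [(0.0, 0.0), (400.0, 0.0), (0.0, 300.0), (500.0, 500.0)]
neighbor_values = [120000.0, 150000.0, 135000.0, 180000.0]

# Hypothetical publicly owned parcel whose recorded value is near zero.
estimate, weights = ordinary_kriging(neighbors, neighbor_values, (200.0, 200.0))
print(round(estimate), [round(w, 3) for w in weights])
```

In a real analysis the variogram would be fitted to the observed parcel values rather than assumed, and the kriging variance would supply the +/- range around each estimate.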
SQL is also a very powerful tool for adding to databases and maintaining them. The combination of location-specific data and optimal location-allocation modeling, together with the data-joining potential of SQL, should allow us to improve on existing datasets. The end results will have real-world value to real estate professionals and café retailers. Using network analytics methods with SQL allows us to apply multiple concepts learned in class in more detail, while also providing useful and tangible results.
Study Area:
The site of interest for this project is Duluth, MN (Figure 1, below), located in southern St. Louis County on the north shore of Lake Superior. The population is ~86,000 per the 2015 census estimates. Duluth and Superior together form what is regionally known as the Twin Ports, the largest port on the Great Lakes. As a result, the port industry is the backbone of the local economy. Duluth also has a bustling tourist industry given its unique location along Lake Superior and its access to many attractions located to the north along Highway 61. Duluth is also home to multiple universities, including the University of Minnesota – Duluth and the College of St. Scholastica. The lively tourist industry, strong local economy, and large student population make Duluth, MN an ideal location for an entrepreneur to consider for a new café.
Data:
The State of Minnesota has an open data policy. We have obtained georeferenced parcel data that includes property assessment values, as well as functional data (road network, incl. city, county, rural, highways, and interstates). Georeferenced census population data has been downloaded from the TIGER/Line database through the United States Census Bureau. For the location-allocation modeling, road speeds for routing will be determined via the MN-DOT road classification system. Point data on the locations of all existing cafés in Duluth will also be used for location-allocation modeling.
Figure 1: Map of Duluth, MN. Image obtained via Google Maps.
Description of Project:
As discussed above, we have St. Louis County parcel data. We also have census tract-level and block group-level data from the US Census. The parcel data is complete, georeferenced, parcel-based property assessment data for use in establishing current market prices for real estate in the Duluth area of St. Louis County, MN, specifically the selling and asking prices of retail properties that are appropriate for a newly opened café with a decided focus on the morning coffee and tea consumer. The census data is complete, georeferenced demographic data for St. Louis County. Because of its fine detail, the demographic data is suitable for performing spatial analyses. The most relevant analyses would assess community population growth, development within existing industries, development of new industries, and the spatial distribution of mean and median incomes across the county population, again with a focus on coffee and tea consumers during the morning hours.
These analyses will need to be augmented with a location-allocation study to model automobile travel times from the mean/median income clusters with an impedance cutoff of 15 minutes. In other words, we will find out which locations on the road network are within a 15-minute travel time of our target demographic. Travel times will be informed by the slightly coarse speed limit data obtained from the Minnesota Department of Transportation. Demographic clustering among the census tracts and block groups may or may not exist. Knowing the mean and median income of each census tract, we will join the demographic data to the location-allocation results in order to find the ideal café locations based on the demographics.
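The 15-minute limit can be sketched as a shortest-path search that stops expanding once the accumulated drive time exceeds the limit. The example below is only an illustration in plain Python, not the GIS tool we will use: the speeds assigned to each road class, the node names, and the segment lengths are hypothetical stand-ins for values that would come from the MN-DOT road classification and the road network data.

```python
import heapq

# Assumed speeds in miles per hour for each road class (hypothetical values).
SPEED_MPH = {"interstate": 60.0, "highway": 50.0, "county": 40.0, "city": 30.0}

def edge_minutes(length_miles, road_class):
    """Drive time for one road segment, in minutes."""
    return length_miles / SPEED_MPH[road_class] * 60.0

def build_graph(segments):
    """Build an undirected adjacency list from (node_a, node_b, miles, road_class) tuples."""
    graph = {}
    for a, b, length_miles, road_class in segments:
        minutes = edge_minutes(length_miles, road_class)
        graph.setdefault(a, []).append((b, minutes))
        graph.setdefault(b, []).append((a, minutes))
    return graph

def service_area(graph, origin, cutoff_minutes=15.0):
    """Dijkstra's algorithm limited to nodes reachable within cutoff_minutes."""
    best = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > best.get(node, float("inf")):
            continue  # stale queue entry; a faster route was already found
        for nxt, dt in graph.get(node, []):
            nt = t + dt
            if nt <= cutoff_minutes and nt < best.get(nxt, float("inf")):
                best[nxt] = nt
                heapq.heappush(heap, (nt, nxt))
    return best

# Hypothetical road segments between intersections A-D.
segments = [
    ("A", "B", 5.0, "city"),         # 10 minutes
    ("B", "C", 2.0, "city"),         # 4 minutes
    ("C", "D", 1.0, "city"),         # 2 minutes, total from A exceeds the cutoff
    ("A", "D", 20.0, "interstate"),  # 20 minutes, exceeds the cutoff
]
reach = service_area(build_graph(segments), "A")
print(sorted(reach))
```

Running the search from each candidate site, or from each demographic cluster, gives the set of network nodes that fall inside the 15-minute limit for that origin.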
It is unclear at the outset whether time and distance measures taken from the edge of each census tract boundary or from the tract centroids (the geographic center points) are more advantageous in terms of generating predictive power while minimizing error propagation. Therefore, we aim to apply both methodologies separately, using the Boolean ‘AND’ operator to intersect the allocation values present in each census tract with the demographic attributes of that tract, thereby assigning the time and distance attributes of all the travel nodes within St. Louis County to the specific census tract in which they occur. The base ranking (1-5) proposed by the location-allocation can then be compared with rankings based solely on the parcel values, solely on the census tract mean/median income attributes, and on the join of the parcel values and census tract income attributes, for triangulation purposes.
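One simple way to compare two rankings of the same five candidate sites is to list the per-site rank differences and summarize them with Spearman's rank correlation. The proposal does not commit to a specific comparison statistic, so Spearman's coefficient is an assumption here, and the site names and ranks are hypothetical.

```python
def rank_differences(base, other):
    """Per-site difference in rank (other minus base)."""
    return {site: other[site] - base[site] for site in base}

def spearman_rho(base, other):
    """Spearman rank correlation for two rankings of the same sites, assuming no ties."""
    n = len(base)
    d2 = sum((base[site] - other[site]) ** 2 for site in base)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Hypothetical ranks (1 = best) for five candidate sites.
base_rank = {"S1": 1, "S2": 2, "S3": 3, "S4": 4, "S5": 5}    # location-allocation
parcel_rank = {"S1": 2, "S2": 1, "S3": 3, "S4": 5, "S5": 4}  # parcel values only

print(rank_differences(base_rank, parcel_rank))
print(spearman_rho(base_rank, parcel_rank))
```

A coefficient near 1 would mean the alternative ranking largely agrees with the location-allocation ranking, while a value near -1 would mean it reverses it. The same comparison can be repeated for the income-only ranking and for the joined parcel and income ranking.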