This is the beginning of an attempt to address some questions raised on iTulip about the accuracy and timeliness of reported housing data. I invite others to contribute, but I'll try to get us off to a decent start...
Housing data sources:
Multiple Listing Services.
This is Realtors' primary tool for communicating the availability of homes for sale and the status of those transactions. It tends to be the most up-to-date source, as most Realtors try to update it immediately with any status changes, and in many cases can be fined for not keeping it current. An MLS system tends to cover a single "market", which can vary a great deal in size and is not consistently defined by city, county or other natural, repeatable boundary. These systems tend to be run by local Realtor associations that are part of a state association and ultimately the National Association of Realtors (NAR). California has 58 counties and more than 100 local Realtor associations, each of which can pick its own MLS system, though some share systems. In any case the data is highly fragmented. With the advent of Realtor.com much of the local data is aggregated to this national site, but individual Realtors can choose whether or not their data is made publicly available there, so it cannot be considered comprehensive. There are still pockets, mostly small towns like Coalinga, CA, where real estate agents choose not to use an MLS, and others where they maintain a closely held system that is not part of NAR.
Although nothing stops builders from using the MLS to list new homes, MLS systems primarily contain existing home data. To the extent that new homes are included, I don't believe there is any way to separate them, so they tend to get reported as part of the existing home statistics. Also note that these systems are typically not used by commercial real estate brokers.
Lag times with MLS data are typically small. It may take a lazy Realtor a couple of days to update the system with a status change, but it tends to be the most up to date.
Data quality tends to be relatively high. It is the only source for current inventory of existing homes - though note that this will not include inventory where agents do not use the MLS, for sale by owner (FSBO) inventory, or pocket listings (listings that agents do not advertise in the hope of finding a buyer themselves to get a double commission). It is also the only source for pending, or in-contract, sales. Finally it tends to be a more informative source for sales data as it typically includes information about how long it took for the property to sell, sales concessions that reduced the actual price paid, and other data that can provide a lot of clues about the actual market conditions.
The lag and quality information discussed above applies at the local level. As data gets aggregated to regional, state and national levels, the lag tends to increase just due to the process, and the quality can decline as details like seller concessions get lost. This largely depends on who's doing the aggregation and the level of care being used. The state and national associations of Realtors are the primary aggregators of this data, and unfortunately I haven't looked under the covers enough to give an accurate assessment of how well they do in this regard.
Public Records
When a property is sold, some form of conveyance, typically a deed, is recorded in the county where the property is located. Liens, including deeds of trust and mortgages, are also recorded. In most, but not all, states the price paid for the property is indicated on the deed (typically because the transfer tax is included, and it is based on a percentage of price). Similarly, most liens include the amount of the lien. These records are freely open to the public. Yes, anyone can easily look up what you paid for your home, and how much you owe on it. There are some non-disclosure states, including Idaho, Indiana, Kansas, Louisiana, Mississippi, Montana, New Mexico, Texas, Utah and Wyoming, where values are not included. There are other states, like California, where it's possible not to disclose price. Public records are also the source for other important information, like foreclosure and tax default notices, that can indicate a lot about the housing market as well.
Technically lag time is non-existent, as the property is sold (conveyed) or encumbered (lien, mortgage, etc.) upon recording of the document. Practically, lag times are significant and vary a great deal. Some counties index and scan documents the same day; other counties don't get documents processed for weeks. Even if the county has the document indexed and available, it is only available locally, or occasionally at a county website. And typically the county only makes limited information available, like the type of document and the names of the parties. For details you have to read the document. So there are businesses that get copies of the documents, have them scanned, and then have data of interest keyed into a database. This process usually adds at least 1 week. Almost all reporting based on public records is from these privately maintained databases. As such the lag is typically at least 2 weeks, and can be much longer.
Quality is quite good on an individual basis. If you pull a specific document you will get all the information that is required to be recorded by law, and often more. The problem lies in the fact that certain data, like prices, are not always required. The public records companies' data entry personnel also regularly make keying errors. These companies do not key everything, either... for example they've only recently started keying the reset dates on adjustable mortgages, and this data is not generally available. Also, not all public records include price data (note the non-disclosure states above, for example). Certainly national reporting based on public records should be carefully scrutinized.
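To make the coverage problem a bit more concrete, here is a rough Python sketch of the kind of check I'd want to see before trusting a price statistic built from public records. The state list is the one given above; the record layout, the function names and the sample records are all made up for illustration and don't correspond to any real vendor's database.

```python
# Non-disclosure states as listed in the post above (postal abbreviations).
NON_DISCLOSURE_STATES = {"ID", "IN", "KS", "LA", "MS", "MT", "NM", "TX", "UT", "WY"}


def in_non_disclosure_state(record):
    """True if the record comes from a state on the non-disclosure list."""
    return record["state"] in NON_DISCLOSURE_STATES


def price_coverage(records):
    """Fraction of records that carry a recorded price (0.0 for an empty list)."""
    if not records:
        return 0.0
    priced = sum(1 for r in records if r["price"] is not None)
    return priced / len(records)


# Made-up sample records; a price of None means no price was recorded or keyed.
sample = [
    {"state": "CA", "price": 500_000},
    {"state": "CA", "price": None},   # price can be withheld in California
    {"state": "TX", "price": None},   # non-disclosure state
    {"state": "FL", "price": 350_000},
]

coverage = price_coverage(sample)

# Prices missing for reasons other than the state being on the list.
missing_elsewhere = [
    r for r in sample
    if r["price"] is None and not in_non_disclosure_state(r)
]

print(f"price coverage: {coverage:.0%}")
print(f"missing outside non-disclosure states: {len(missing_elsewhere)}")
```

If the coverage is low, or the gaps cluster in particular states, any "national" price figure built on that data really describes only the places that happen to report prices.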
Home builders
I know that most new home reporting comes directly from the home builders themselves. I know very little about this process, but it seems to me to be highly susceptible to error. For example, I know a number of small builders locally that do not participate in this reporting. As I believe a reasonably high percentage (>30%?) of new construction is by small local builders, this alone seems like a significant flaw.
I imagine other sources for new home data include county assessors' records and/or building permit records. Based on my experience with both departments, these seem more susceptible to both lag and quality issues than the public records from the recorder's office discussed above.
Median prices:
In an earlier post I quoted a California Association of Realtors report that stated there was an 8% increase in median prices, year over year, in May. To be more precise, that means the median price of the homes actually sold that month was 8% higher than the median price of the homes actually sold in the same month the year before. Note that this is a completely different set of homes, being purchased under different circumstances (at the least, a very different interest rate environment), so I'm not sure what this really tells us. I personally think 8% is too big a jump to be an actual increase in prices. I think a more likely explanation is that there is now more volume occurring in more expensive, urban areas, and less volume in the more speculative, less expensive, outlying areas. It could also be that a disproportionate share of the associated 21% decline in sales volume was first-time home buyers, who tend to buy less expensive homes. Other possibilities abound. Without this additional analysis, I think most reported data regarding month-over-month and especially year-over-year changes in median prices is essentially worthless.
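Here is a toy illustration of that mix-shift effect, as a minimal Python sketch. All of the prices (in thousands of dollars) are invented, and the numbers are not meant to reproduce the 8% or 21% figures from the report. The point is only that the median can rise when no individual price has moved, simply because fewer of the cheaper homes sell.

```python
from statistics import median


def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# Invented sale prices in thousands of dollars (illustrative only).
last_year_outlying = [300, 310, 320, 330, 340, 350]
last_year_urban = [360, 380, 400, 420]

# This year the same price points apply, but half the outlying sales vanish.
this_year_outlying = [300, 320, 340]
this_year_urban = [360, 380, 400, 420]

last_year_sales = last_year_outlying + last_year_urban
this_year_sales = this_year_outlying + this_year_urban

last_median = median(last_year_sales)   # 345.0
this_median = median(this_year_sales)   # 360

print(f"sales volume change: {pct_change(len(last_year_sales), len(this_year_sales)):.0f}%")
print(f"median price change: {pct_change(last_median, this_median):.1f}%")
```

Every price in this year's list also appears in last year's list, yet the reported median goes up. Splitting the median by area or price tier before comparing years would be one way to separate a real price change from a change in what is selling.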
More to come when I have time... I encourage others to contribute, ask questions, etc. I will monitor this thread.