Every CRE data conversation eventually arrives at the same frustrating truth: no single source has the complete picture. Analysts who rely exclusively on one data category — whether that is a subscription database, public county records, or aggregated MLS feeds — are working with a systematically incomplete view of their market. The practical response is not to pick the "best" source; it is to understand what each source covers, where it has structural gaps, and how to cross-reference across sources to close those gaps.
This piece is a working comparison of three primary data categories used in CRE analytics: commercial database platforms (of which CoStar is the dominant example), public deed and assessor records, and aggregated MLS transaction data. The goal is not to rank them — that framing is too simple — but to map the coverage, lag, and blind spots of each so you can make informed decisions about how to combine them.
Commercial Database Platforms: Coverage Depth and Its Cost
CoStar is the largest and most widely used commercial property database in North America. Its core value proposition is data depth: building-level records with size, year built, occupancy, available space, historical lease transactions, and comparable sale prices with per-SF attribution. For institutional-grade office, industrial, retail, and multifamily assets in major metro markets, CoStar's coverage is genuinely comprehensive — particularly for properties that have been actively marketed through brokerage channels.
The gaps in commercial database platforms are structural, not accidental. First, off-market transactions are systematically underrepresented. When a sale occurs through a direct relationship between buyer and seller — without a listing broker who inputs the deal — that transaction may appear in the database only after it is reconstructed from public deed records, often with a lag of weeks to months and sometimes with incomplete data fields. In the Southeast mid-market (transactions below $5–10M), a meaningful share of deals move off-market. That is the portion of the transaction universe that CoStar-only analysis misses most consistently.
Second, lease transaction data in commercial databases is broker-reported. A landlord and tenant who negotiate a direct renewal without broker involvement have no incentive to report the transaction. This creates a coverage gap in lease comparable data that is hard to quantify but directionally consistent: the effective rent achieved in direct renewals — where tenants often negotiate concessions that are not visible in the broker-reported comparable — is underrepresented in the database. For markets with high direct-renewal rates (suburban office with large anchor tenants, for example), this gap can materially affect asking-rent versus effective-rent assumptions.
Third, commercial database pricing reflects what was paid, not what the current market would pay. This is not a flaw — it is how transaction data works — but it means that in rapidly shifting markets, the database lag between when a deal closes, when it enters the system, and when an analyst queries it can be 60–120 days. In a market where cap rates are moving 50–75 basis points per quarter, that lag has real consequences for current-market valuations.
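To put rough numbers on that lag, here is a minimal sketch of the arithmetic. It assumes cap rates drift linearly within a quarter and that NOI is unchanged over the lag. The 6.5% closing cap rate and both function names are hypothetical, chosen only for illustration.

```python
# Sketch only: linear cap-rate drift and constant NOI are simplifying assumptions.

def cap_rate_drift_bps(lag_days: float, bps_per_quarter: float,
                       days_per_quarter: float = 91.25) -> float:
    """Cap-rate movement (in bps) accumulated over the reporting lag."""
    return bps_per_quarter * lag_days / days_per_quarter


def value_change_pct(cap_rate_at_close: float, drift_bps: float) -> float:
    """Percent change in value for a constant NOI when the cap rate moves.

    Value = NOI / cap rate, so the ratio of values is the inverse
    ratio of cap rates and NOI cancels out.
    """
    current_cap = cap_rate_at_close + drift_bps / 10_000
    return (cap_rate_at_close / current_cap - 1) * 100


# Bounds taken from the ranges in the text: 60-120 days, 50-75 bps per quarter.
low_drift = cap_rate_drift_bps(lag_days=60, bps_per_quarter=50)
high_drift = cap_rate_drift_bps(lag_days=120, bps_per_quarter=75)

# Hypothetical 6.5% cap rate at closing.
worst_case_value_change = value_change_pct(0.065, high_drift)

print(f"drift: {low_drift:.0f} to {high_drift:.0f} bps")
print(f"value change at high end: {worst_case_value_change:.1f}%")
```

Under these assumptions, a comp at the slow end of the lag range can be roughly a full quarter's worth of cap-rate movement out of date by the time it is used.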
Public Deed and Assessor Records: The Ground Truth Layer
Deed transfers in the United States are recorded at the county level, with the register of deeds, county recorder, or clerk of court depending on the jurisdiction. The result is a public record that is not subject to broker reporting incentives, database platform coverage gaps, or selection bias. For every conveyance of real property that closes (brokered or off-market, institutional or private individual, arm's length or related-party) there is a recorded deed. In most North Carolina counties, including Mecklenburg, these records are accessible within a few weeks of closing and contain the grantor/grantee names, legal description, instrument date, and in most cases the sale consideration (transfer price).
This makes public deed records the most complete source for transaction coverage. An analyst who systematically monitors deed transfers in a target submarket will capture transactions that never appear in commercial databases or MLS systems. This is particularly valuable for identifying off-market buyers (who are often the most active acquirers in a market), tracking land and smaller commercial transactions below the reporting thresholds of institutional brokerages, and monitoring portfolio-level transfers or entity-to-entity conveyances that would otherwise be invisible until disclosed in corporate filings.
The limitations of public records are real, however. Raw deed data does not come annotated. The consideration amount is recorded, but not the property's square footage, occupancy at time of sale, building class, or any of the contextual data that makes a transaction useful as a comparable. Translating a deed transfer into an actionable comparable requires matching it against assessor records (for parcel characteristics), building permits (for construction history), and in some cases direct research to determine the condition of the transaction. That enrichment process is laborious at scale and is where most of the data infrastructure investment in proptech is focused.
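The enrichment step can be sketched as a join on parcel ID. This is a minimal illustration, not a production pipeline: the record shapes, field names, and the `enrich` function are all hypothetical, and real county data usually needs parcel ID normalization before any join will work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deed:  # hypothetical shape of a recorded deed
    parcel_id: str
    grantee: str
    consideration: float


@dataclass
class AssessorRecord:  # hypothetical shape of an assessor parcel record
    parcel_id: str
    building_sf: float
    use_class: str


def enrich(deeds: list[Deed], assessor: list[AssessorRecord]) -> list[dict]:
    """Attach parcel characteristics to each deed and derive price per SF."""
    by_parcel = {a.parcel_id: a for a in assessor}
    comps = []
    for d in deeds:
        a = by_parcel.get(d.parcel_id)
        price_per_sf: Optional[float] = None
        # Only compute $/SF when there is a building and a real price.
        if a is not None and a.building_sf > 0 and d.consideration > 0:
            price_per_sf = d.consideration / a.building_sf
        comps.append({
            "parcel_id": d.parcel_id,
            "grantee": d.grantee,
            "consideration": d.consideration,
            "use_class": a.use_class if a else None,
            "price_per_sf": price_per_sf,
            "needs_research": a is None,  # no assessor match: manual follow-up
        })
    return comps


# Made-up example records.
example_comps = enrich(
    [Deed("001-23", "ACME HOLDINGS LLC", 2_400_000.0),
     Deed("009-99", "J. SMITH", 750_000.0)],
    [AssessorRecord("001-23", 12_000.0, "RETAIL")],
)
```

Unmatched deeds are kept and flagged rather than dropped, since the missing matches are exactly the records that need manual research.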
Assessor data adds the parcel-characteristic layer: lot size, building area, construction year, use classification, and assessed value history. In active markets with regular reassessment cycles, assessor data is reasonably current on physical characteristics. In markets with infrequent reassessment, assessed values can diverge significantly from market values, and the assessor's use classification may not reflect an asset's current economic use.
Aggregated MLS Data: Commercial Reach, Residential Heritage
MLS (Multiple Listing Service) platforms were built for residential real estate. Commercial properties — particularly smaller income properties and owner-occupied commercial — have historically been listed on residential MLS platforms in markets where dedicated commercial listing platforms lack critical mass. Aggregated MLS data therefore contains a meaningful volume of commercial transactions, particularly in the $500K–$3M range that is below the floor of institutional brokerage attention.
The coverage strength of aggregated MLS data is in this mid-market tier: retail condominiums, small multi-tenant retail, owner-occupied office and flex, and smaller multifamily (2–20 units in many markets). For these asset types, MLS-aggregated data may capture a higher transaction volume than commercial database platforms, because the listing activity flows through residential brokerage networks that are well-represented in MLS systems.
The limitations are significant for institutional-grade analysis. MLS data does not contain cap rate, NOI, or lease information — it is primarily price and physical-characteristics data. MLS descriptions are agent-written and not subject to the standardization of commercial database platforms, which means use-class categorization and property descriptions require interpretation. Larger commercial transactions are rarely listed on residential MLS, and when they are, the data quality is often lower than in dedicated commercial databases.
None of this makes MLS data unreliable for commercial use. It is a meaningful data source for mid-market transaction analysis and, when combined with public records, significantly improves coverage in the $500K–$5M transaction range. The point is narrower: MLS data alone, without public records cross-referencing, misses a substantial share of commercial transactions that were never listed. Treating it as a complete market picture for institutional CRE underwriting will produce systematically biased comp sets weighted toward marketed transactions.
The Transactions That Fall Through the Cracks
Understanding the structural blind spots of each data category allows you to identify the types of transactions most likely to be missing from any single-source analysis — and therefore the types of market intelligence most likely to appear in cross-source work.
Portfolio transfers and entity-level transactions are frequently invisible in commercial databases and MLS but are captured in deed records. When an ownership entity is restructured and assets transfer between related entities, the deed record captures the conveyance. If it is a zero-consideration transfer for estate or organizational purposes, it is flagged as non-arm's length. But if it is an arm's length portfolio sale structured as an entity transfer rather than a property transfer, it may never appear as a transaction in the brokerage data universe — and yet it is a legitimate market price signal.
Below-threshold transactions, meaning deals smaller than those institutional brokerages focus on, are a consistent blind spot for commercial database coverage and are irregularly represented in MLS data. In Southeast markets with active regional investor bases, transactions in the $1–5M range represent a substantial share of total transaction volume by count. These deals are consistently captured in deed records and sometimes in MLS, but underrepresented in commercial database platforms relative to their actual market frequency.
New construction sales and land transactions deserve specific mention. Assessor records and deed transfers capture these reliably; commercial databases may lag or underrepresent them, particularly for owner-user new construction where no brokerage transaction occurred. For markets with active land assembly activity — relevant for entitlement analysis — cross-referencing deed transfer patterns against assessor parcel maps is the only reliable way to identify assembly behavior before a rezoning petition is filed.
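One way to look for assembly behavior is to group deed transfers by buyer and check whether the acquired parcels touch each other. The sketch below assumes you already have a parcel adjacency map derived from the assessor's parcel layer; the `adjacency` structure and the `find_assemblies` function are hypothetical. It also matches on exact grantee name, which is a real limitation because assemblers often buy through several differently named entities.

```python
from collections import defaultdict


def find_assemblies(transfers, adjacency, min_parcels=2):
    """Return clusters of contiguous parcels acquired by the same grantee.

    transfers: iterable of (grantee, parcel_id) pairs from deed records.
    adjacency: dict mapping parcel_id -> set of neighboring parcel_ids
               (assumed to come from the assessor parcel map).
    """
    owned = defaultdict(set)
    for grantee, parcel in transfers:
        owned[grantee].add(parcel)

    results = {}
    for grantee, parcels in owned.items():
        seen = set()
        clusters = []
        for start in sorted(parcels):
            if start in seen:
                continue
            # Depth-first walk over neighbors this grantee also acquired.
            cluster, stack = set(), [start]
            while stack:
                p = stack.pop()
                if p in cluster:
                    continue
                cluster.add(p)
                stack.extend(n for n in adjacency.get(p, ())
                             if n in parcels and n not in cluster)
            seen |= cluster
            if len(cluster) >= min_parcels:
                clusters.append(sorted(cluster))
        if clusters:
            results[grantee] = clusters
    return results


# Made-up example: parcels A-B-C-D sit in a row along one block face.
example_adjacency = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
example_transfers = [("ACME LLC", "A"), ("ACME LLC", "B"),
                     ("ACME LLC", "D"), ("OTHER LLC", "C")]
example_assemblies = find_assemblies(example_transfers, example_adjacency)
```

In the example, ACME LLC holds A and B as a contiguous pair, while D is separated from them by a parcel it does not own, so only the A-B cluster is reported.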
A Practical Cross-Source Workflow
The effective approach is to use each source for what it does best, in sequence. Commercial database platforms provide the enriched context layer: building characteristics, historical lease data, cap rate evidence for marketed transactions, and submarket classification that structures the analysis. Public deed records provide coverage completeness: every recorded conveyance in the county, regardless of how it was sourced, with legal parcel identification that anchors the record to the assessor's parcel map. Aggregated MLS data fills the mid-market gap for smaller income properties and provides additional market velocity indicators (days on market, list-to-close ratio) that commercial databases do not consistently track.
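The velocity indicators are simple to compute once listing and closing fields are available. A minimal sketch follows, with hypothetical field names and made-up listings. It assumes "list-to-close ratio" means close price divided by list price; if your MLS feed defines it differently, adjust accordingly.

```python
from datetime import date
from statistics import median


def velocity(listings):
    """Median days on market and median close-to-list price ratio."""
    doms = [(l["close_date"] - l["list_date"]).days for l in listings]
    ratios = [l["close_price"] / l["list_price"] for l in listings]
    return {"median_dom": median(doms), "median_close_to_list": median(ratios)}


# Made-up closed listings for illustration.
example_listings = [
    {"list_date": date(2024, 1, 1), "close_date": date(2024, 3, 1),
     "list_price": 1_000_000, "close_price": 950_000},
    {"list_date": date(2024, 2, 1), "close_date": date(2024, 3, 2),
     "list_price": 500_000, "close_price": 500_000},
]
example_velocity = velocity(example_listings)
```

Medians are used rather than means because a handful of stale listings can otherwise dominate a small mid-market sample.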
Cross-referencing these sources reveals transactions that each source alone would miss. A deed transfer that does not appear in the commercial database may be an off-market deal by an active buyer, which is a signal worth investigating. A property that appears in both deed records and MLS at different price levels may indicate a distressed sale: MLS captured the listing, and the commercial database can add occupancy context. An assessor record showing a use-class change, for example from vacant land to improved commercial, may precede a transaction in the commercial database by 6–12 months.
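The first of those checks, deed transfers with no counterpart in the commercial database, amounts to an anti-join on parcel ID within a date tolerance. The sketch below uses hypothetical record shapes and an assumed 45-day matching window. It also skips zero-consideration deeds on the assumption that they are non-arm's-length conveyances.

```python
from datetime import date, timedelta


def unmatched_deeds(deeds, db_sales, window_days=45):
    """Return priced deed transfers with no database sale on the same
    parcel within +/- window_days of the recording date."""
    db_dates = {}
    for s in db_sales:
        db_dates.setdefault(s["parcel_id"], []).append(s["sale_date"])

    window = timedelta(days=window_days)
    missing = []
    for d in deeds:
        if d["consideration"] <= 0:
            continue  # assumed non-arm's-length; not a price signal
        candidates = db_dates.get(d["parcel_id"], [])
        if not any(abs(d["recorded"] - sd) <= window for sd in candidates):
            missing.append(d)
    return missing


# Made-up records for illustration.
example_deeds = [
    {"parcel_id": "P1", "recorded": date(2024, 3, 1), "consideration": 1_500_000},
    {"parcel_id": "P2", "recorded": date(2024, 3, 5), "consideration": 900_000},
    {"parcel_id": "P3", "recorded": date(2024, 3, 9), "consideration": 0},
]
example_db_sales = [{"parcel_id": "P1", "sale_date": date(2024, 2, 20)}]

off_market_candidates = unmatched_deeds(example_deeds, example_db_sales)
```

Here only the P2 transfer is returned: P1 matches a database sale ten days earlier, and P3 is excluded as a zero-consideration conveyance. The output is a candidate list for follow-up research, not a confirmed set of off-market deals.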
The data infrastructure required to do this systematically is real, and it is one reason why parcel-level analytics tools that integrate across source categories add value over manual research. The coverage and lag differentials between sources are not trivial; they represent a meaningful information advantage for analysts who systematically close them over those who rely on a single subscription database as their complete market view.