We Tested 5 AI Valuation Tools Against Actual Sale Prices. The Results Were Mixed


After years of hearing vendors quote online valuations as gospel, I decided to run an actual test.

I collected 47 recent sales across Sydney and Melbourne—properties I knew well enough to verify the sale prices. Then I ran each through five different AI valuation tools before comparing estimates to actual outcomes.

The results? Less bad than I expected, but with consistent patterns worth understanding.

The Test Setup

Sample: 47 properties sold between October and December 2025

Locations:

  • Sydney: Eastern Suburbs, Inner West, North Shore
  • Melbourne: Inner South, Inner East, Bayside

Property types: Mix of houses (28) and apartments (19)

Tools tested:

  • CoreLogic automated valuation
  • PropTrack estimate
  • Domain Home Price Guide
  • REA Group property estimate
  • One leading bank valuation tool

I ran valuations in the week before each property sold, capturing the “live” estimate rather than post-sale adjustments.

Overall Accuracy

Across all 47 properties:

Tool         Median Error   Within 5%   Within 10%
CoreLogic    4.2%           57%         83%
PropTrack    4.8%           49%         79%
Domain       5.1%           45%         77%
REA          5.4%           43%         74%
Bank tool    6.2%           38%         70%

A median error of 4-6% on a $1.5 million property is $60,000-90,000. Not trivial.
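If you want to replicate these metrics on your own sales data, they're simple to compute. Here's a minimal Python sketch; the figures in `sales` are invented for illustration, not drawn from my 47-property dataset.

```python
# A minimal sketch of the metrics above: median absolute percentage error,
# and the share of estimates landing within 5% and 10% of the sale price.
# The numbers in `sales` are hypothetical, not from the actual test.
from statistics import median

sales = [
    {"estimate": 1_480_000, "sold": 1_550_000},
    {"estimate": 2_100_000, "sold": 2_030_000},
    {"estimate": 875_000, "sold": 860_000},
]

# Absolute percentage error per property, measured against the sale price
errors = [abs(s["estimate"] - s["sold"]) / s["sold"] for s in sales]

median_error = median(errors)
within_5 = sum(e <= 0.05 for e in errors) / len(errors)
within_10 = sum(e <= 0.10 for e in errors) / len(errors)

print(f"Median error: {median_error:.1%}")
print(f"Within 5%:  {within_5:.0%}")
print(f"Within 10%: {within_10:.0%}")

# And the dollar impact of a 4-6% error band at a given price point
price = 1_500_000
print(f"4-6% of ${price:,} is ${0.04 * price:,.0f}-${0.06 * price:,.0f}")
```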

Where They Struggled

Renovated properties

AI tools base estimates largely on comparable sales. But “comparable” often means similar land size and bedroom count—not renovation quality.

A Hawthorn terrace had been gutted and rebuilt to an architect-designed standard. Every tool undervalued it by 12-18%. The comparables were unrenovated or dated, and the models couldn't see the difference.

Unique configurations

A Paddington semi converted to a home office with a separate street entrance baffled the algorithms. Was it a house or a commercial property? The tools didn't know. Nor did they recognise its appeal to buyers wanting work-from-home setups.

Recent infrastructure changes

A Hurstville apartment near a newly completed metro station sold well above estimates. The tools hadn't yet incorporated the access improvement into nearby valuations; their training data predated the opening.

Premium quality in average suburbs

A well-built family home in a suburb known for modest stock was consistently undervalued. The algorithms expected median quality for the postcode.

Where They Performed Well

Standard stock in established markets

Cookie-cutter apartments in well-traded buildings often came within 2-3% of sale price. Lots of recent comparables meant the models had good data.

Houses on standard blocks in homogeneous suburbs

Rows of similar houses selling regularly produced reliable estimates. The AI had exactly what it needed: plentiful, consistent data.

Properties without unusual features

The more typical a property, the better the estimate. This makes mathematical sense—models predict toward the mean.

What This Means for Agents

Lead with local knowledge on unusual properties. When a vendor waves their phone showing an AI estimate, you have data to explain why their property differs from algorithmic assumptions.

Use AI valuations as starting points, not conclusions. They're useful for initial discussions, less useful for pricing strategy on distinctive properties.

Track your market’s AI accuracy. Run your own tests. If you find consistent bias in particular property types, that’s valuable knowledge.
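One way to do that, sketched below in Python with made-up numbers: compute the signed error for each sale and group by property type. A median signed error well away from zero flags consistent over- or under-estimation.

```python
# Hypothetical sketch: checking a tool for systematic bias by property type.
# Signed error is positive when the tool overestimates and negative when it
# underestimates; a median well away from zero suggests consistent bias.
from collections import defaultdict
from statistics import median

sales = [
    {"type": "house", "estimate": 1_480_000, "sold": 1_550_000},
    {"type": "house", "estimate": 2_050_000, "sold": 2_300_000},
    {"type": "apartment", "estimate": 875_000, "sold": 860_000},
    {"type": "apartment", "estimate": 690_000, "sold": 700_000},
]

by_type = defaultdict(list)
for s in sales:
    signed_error = (s["estimate"] - s["sold"]) / s["sold"]
    by_type[s["type"]].append(signed_error)

for prop_type, errors in by_type.items():
    print(f"{prop_type}: median signed error {median(errors):+.1%} (n={len(errors)})")
```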

Educate vendors on methodology. Most people assume these tools “know” their specific property. They don’t—they know the category the property fits into.

What This Means for Buyers and Sellers

Don’t anchor on online estimates. They’re one data point, not the answer. Especially if your property has features the algorithm can’t assess.

Understand the range. Most tools show a range, not a single number. The actual value is at least as likely to be near the edges as near the middle.

Get professional advice for significant decisions. If the estimate is driving whether you buy, sell, or refinance, get a proper valuation. CoreLogic and PropTrack both offer more detailed reports than their free estimates.

The Bigger Picture

AI valuation tools will keep improving. More data, better models, and refinements to handle edge cases will narrow the accuracy gap over time.

But they’ll never fully replace human judgment on individual properties. Real estate value ultimately depends on what a specific buyer will pay for specific features—and that’s contextual in ways algorithms struggle to capture.

The tools are most useful as one input among several, not as the final word. Both agents and consumers would benefit from understanding their limitations as clearly as their capabilities.

One Caveat

My sample of 47 properties isn’t statistically comprehensive. Different price points, property types, and markets would produce different results. This was a practical test, not academic research.

But it’s more rigorous than simply trusting marketing claims about AI accuracy. And the patterns—struggling with unusual properties, succeeding with standard stock—are consistent with what other agents report anecdotally.

If you’re relying on these tools for significant decisions, do your own testing. The results may surprise you.