- DBSCAN is a density-based clustering algorithm
- Density = the number of points within a specified radius (Eps)
- A point is a core point if it has more than a specified number of points (MinPts) within Eps
  - These are points in the interior of a cluster
- A border point has fewer than MinPts within Eps, but is in the neighborhood of a core point
- A noise point is any point that is neither a core point nor a border point
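The three point types can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `classify_points` is made up for this note, distances are Euclidean, and a point is not counted as its own neighbor (conventions differ on that last point).

```python
import numpy as np

def classify_points(X, eps, min_pts):
    """Label each row of X as 'core', 'border', or 'noise' (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # All pairwise Euclidean distances (O(n^2) memory, fine for small examples).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    within_eps = dists <= eps
    # Assumption: a point does not count as its own neighbor.
    counts = within_eps.sum(axis=1) - 1
    # Core: more than MinPts points within Eps.
    is_core = counts > min_pts
    # Border: not core, but at least one core point lies within Eps.
    near_core = (within_eps & is_core[None, :]).any(axis=1)

    labels = np.full(len(X), "noise", dtype="<U6")
    labels[near_core] = "border"
    labels[is_core] = "core"  # core overrides border
    return labels
```

For example, with `eps=1.1` and `min_pts=2`, a tight group of five points comes out as core, a point hanging one unit off the edge of that group comes out as border, and an isolated far-away point comes out as noise.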
2. DBSCAN Algorithm
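A minimal sketch of how the DBSCAN procedure is commonly described, built on the core/border/noise definitions from the notes above: start a cluster at any unassigned core point, pull in everything within Eps, and keep expanding only through core points. The name `dbscan_sketch`, the use of `-1` for noise, and the choice not to count a point as its own neighbor are assumptions of this example, not something stated in these notes.

```python
import numpy as np

def dbscan_sketch(X, eps, min_pts):
    """Return one cluster id per point; -1 marks noise (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    idx = np.arange(n)
    # Neighbors within Eps, excluding the point itself (an assumption).
    neighbors = [np.flatnonzero((dists[i] <= eps) & (idx != i)) for i in range(n)]
    is_core = np.array([len(nb) > min_pts for nb in neighbors])

    labels = np.full(n, -1)
    cluster_id = 0
    for i in range(n):
        # Only an unassigned core point can start a new cluster.
        if not is_core[i] or labels[i] != -1:
            continue
        labels[i] = cluster_id
        stack = [i]  # holds core points whose neighborhoods still need expanding
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster_id  # core or border point joins the cluster
                    if is_core[q]:
                        stack.append(q)     # border points are not expanded
        cluster_id += 1
    return labels
```

Points still labeled `-1` at the end are the noise points. A border point that lies within Eps of core points from two clusters goes to whichever cluster reaches it first in this sketch.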
3. Strengths vs. Weaknesses
- Strengths
  - Resistant to noise
  - Can handle clusters of different shapes and sizes
- Weaknesses
  - Struggles when the dataset has clusters of varying densities
  - Struggles with high-dimensional data
4. Determining Eps and MinPts
- The idea is that for points in a cluster, their k-th nearest neighbors are at roughly the same distance
- Noise points have their k-th nearest neighbor at a farther distance
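One common way to use this observation is to compute every point's distance to its k-th nearest neighbor, sort those distances, and look for the place where they jump sharply. Eps is then picked near that jump, with MinPts set to k. The sketch below only computes the sorted distances; the name `sorted_k_distances` and the use of Euclidean distance are assumptions of this example.

```python
import numpy as np

def sorted_k_distances(X, k):
    """Distance from each point to its k-th nearest neighbor, sorted ascending."""
    X = np.asarray(X, dtype=float)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # After sorting each row, column 0 is the point's distance to itself (0),
    # so column k holds the distance to the k-th nearest other point.
    kth = np.sort(dists, axis=1)[:, k]
    return np.sort(kth)
```

Plotting the returned array against its index gives a curve that stays flat for points inside clusters and rises steeply for noise points, so the height of the curve just before the steep rise is a reasonable candidate for Eps.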